Council for Tobacco Research
Statistics, Then and Now the American Statistician [St Duplicate of 11320352]
Fields
- Depository Date: 31 Oct 1996
- Request: 4
- Master ID: 11320359-0364
- Box: 214
- Type: SCIENTIFIC ARTICLE
- UCSF Legacy ID: wrl6aa00
Statistics, Then and Now
EDWIN B. WILSON*
I learned of the year 1839 very early in life. It was
the year of my father's birth. He was not a statistician
but a student of the Greek and Latin classics, and my
schooling like that of most of my contemporaries who
were fitting for college was chiefly classical. There was
Greek and Latin, arithmetic, algebra and geometry, and
a modern language but no science.
The first data I read in the preschool days were in
the margins of the Bible. The creation was in 4004 B.C.,
already in the agricultural age. The ancient Jewish
historian had not heard of the pre-agricultural ages, but
he may have been no worse off than the chronologist
Bishop Ussher (Usher) of 1581-1656 whose biblical
chronology was published shortly before his death. The
cosmogonists are always giving us dates for future
cosmogonists to revise. Then there was the universal flood
that destroyed all animal life except for the inhabitants
of Noah's ark in 2349 B. C., an event of which we
are far from certain, albeit sure it did not occur at
that time. And Joseph went down into Egypt in 1729
and, being a smart fellow, was well received only to
have his whole tribe driven out in the exodus of 1460.
By this date the chronologist may perhaps be not more
than 250 years off.
There are the more recent statistical estimates by our
best demographers which turned out none too accurate.
Sometime around 1937 that great planner, President
Roosevelt, asked his National Resources Committee to
forecast our population to 1980 so plans could be made
for taking care of it. The best demographers were then
Thompson and Whelpton of the Scripps Foundation for
Research in Population. In great detail, as was desired,
they made 7 forecasts on seven different sets of hypotheses
assuming the past characteristics of the population
known up to 1935, leading by extrapolation to the 7
distributions they found. Leaving out the detail of age
distribution, I give only the totals for the whole United
States. The actual census figures for 1940, 1950, 1960,
to the nearest million are 132, 151, 179. The forecast
for 1940 was naturally about right. For 1950 the 7 fore-
casts varied from 136 to 144, with the census at 151
well above the highest. For 1960 the 7 forecasts were
from 137 to 155, but 179 was the census figure, and
this was above the highest estimate for 1980, which was
174. The details can be found in the first chapter of
that 305-page quarto Government monograph "The Problems
of a Changing Population". The significance of the
participial adjective Changing was that, relatively speak-
ing, children were to be fewer and old folks more plenti-
ful than in the past.
* Dr. Wilson died on December 28, 1964, one month after pre-
senting these remarks at the Boston Anniversary Meeting of ASA.
See the memorial article on Dr. Wilson by Jane Worcester in this
issue.
Well, the adjective was right, the age distribution
changed. Not long ago I received from Health, Education
and Welfare an advertisement for their "Trends" in
which they, too, gave population forecasts probably by
able demographers of the present generation. For 1980
there are 4 in number, varying from 231 to 273. Possibly
this sort of statistical work can be done better now than
37 years ago. The increase from 179 in 1960 lies between
52 and 94 which allows a variation of 29% to 54%
during the 20 years 1960 to 1980; whereas Thompson
and Whelpton basing on 1935 at 127 allowed only a
variation of 8% to 22% for the 25 years from 1935
to 1960. The HEW forecasters are clearly far more cau-
tious, much nearer to saying "we can't forecast". By
1980 Changing may still be a good adjective (enough
room has been allowed to make quite a difference), but
what the 1980 figure will be we do not know.
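The percentage ranges quoted in this comparison can be re-derived from the totals given in the text (in millions). The following is only a minimal sketch in Python; the helper name percent_range is hypothetical and is not part of the original article.

```python
def percent_range(base, low, high):
    """Return the (low, high) forecast increase over `base`, in percent, to one decimal."""
    return tuple(round(100 * (x - base) / base, 1) for x in (low, high))

# Thompson and Whelpton: 1935 base of 127 million, 1960 forecasts of 137 to 155.
tw_1960 = percent_range(127, 137, 155)

# HEW "Trends": 1960 census of 179 million, 1980 forecasts of 231 to 273.
hew_1980 = percent_range(179, 231, 273)

print("Thompson and Whelpton, 1935 to 1960:", tw_1960)
print("HEW, 1960 to 1980:", hew_1980)
```

With the figures as printed, the Thompson and Whelpton range comes out at roughly 8% to 22%, matching the text. The HEW range comes out at roughly 29% to 52.5%, a little under the 54% quoted.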
After graduation from Harvard I went to Yale where
1839 came to mean the year of birth of J. Willard Gibbs,
the great mathematical physicist, with whom I worked.
He was working intensively on the preparation of his
book "Elementary Principles of Statistical Mechanics",
which appeared in 1902 as one of the volumes in celebra-
tion of Yale's Bicentennial. Statistical here means proba-
bilistic, as in the kinetic theory of gases; it did not
imply data. And elementary did not mean easy. Professor
Boussinesq with whom I studied in Paris in 1902-3
said that he had a copy of the book and it would prob-
ably take him 5 years to absorb its full significance.
The volume came out just before the quantum theory;
it was written in terms of the current dynamics, but
so fundamental was the analysis that it was actually
useful when the quantum theory came along. In point
of fact, however, there was one point at which Gibbs'
theory did not check with the facts, and he well knew
it; the ratio of the specific heats of polyatomic gases
indicated fewer degrees of freedom in the molecule than
the dynamical theory implied, a matter I had heard
Gibbs say had troubled him for 30 years without his
seeing any way to explain it. He died in 1903; had
he lived another decade, he would have seen the quantum
theory explain it.
When I turned my attention to that very difficult field
of medical, public health and social statistics, 1839 came
up again as the year of birth of that great, versatile
genius Charles S. Peirce, logician, mathematician, geodesist
and philosopher. I had known of his work in logic
and mathematics when I was a very young instructor
in mathematics at Yale before I was transferred to teaching
the work in mathematical physics of Willard Gibbs
after his death. But what interested me in the 1920s
at the Harvard School of Public Health in Peirce's work
was his paper "On the Theory of Errors of Observations"
in the Report of the Superintendent of the U. S.
Coast Survey for 1870, in which 24 series of about
500 reaction times were listed. It is not too easy to
come by a large series of a large number of observations
for statistical analysis. The results of working up the
series (published in Proc. Natl. Acad. Sci., vol. 15, pp.
120-5, 1929) showed that the 24 frequency functions
were definitely skew by the ordinary tests, had practically
uncontrolled values of the kurtosis, and that on the whole
the medians were as well determined as the means, and
finally that the daily means or medians and their stand-
ard deviations varied very much more than they should
have by the usual sampling formulas and even by those
expressed in terms of the fourth moment instead of using
those depending only on the standard deviation and the
assumption of normality. How it was that Peirce could
have considered that he was proving the normal law
I had no idea.
Besides these three birth dates there was one event
in 1839 in addition to that which we are celebrating
today which has been of great statistical significance to
me. I refer to the letter dated February 28 from Gauss
to Bessel, a fellow professor of astronomy who had sent
Gauss some work he had done on errors of observation.
As we all know, Gauss had published his Theoria Motus
in 1809 using the normal law but in his Theoria Combinationis
in 1823 had pursued a different course. In his
letter of 1839 he wrote that he had never publicly explained
why he had "abandoned the metaphysics applied
to the method of least squares" in his discussion of errors
of observation in 1809 and replaced it by the principle
of least cost (erroribus minimis obnoxiae). I will not
quote the whole paragraph given on pp. 523-4 of the
Briefwechsel zwischen Gauss und Bessel. Gauss had become
not only a very great mathematician but a
distinguished scientist.
I do not believe I knew when our Association started
at the time I was its president 35 years ago, and I do
not now know who were those who started it or why
they did so. Insofar as they were practical men interested
in affairs, whether public or their own, they probably
were not interested in theory so much as in practice.
In this sense statistics is indeed very old. There was
John Graunt back in 1662 who wrote that remarkable
book on Vital Statistics, though not so called, of which
the three hundredth anniversary was celebrated by the
Royal Society two years ago this month. This was considerably
before Jacques Bernoulli was writing his Ars Conjectandi,
often cited as the first on probability, and wrote
Leibniz asking if he could not treat an observed rate
as a probability. You know what Leibniz replied: that
he could not, because although there were recurrences
in Nature they were never exact.
The trouble with statistics is that it is so very old
and still consists of data which have no common laws
that are of much use in interpreting it. Vital statistics,
economic statistics, social statistics, meteorological
statistics, astronomical statistics, etc., all depend for their
proper interpretation upon the state of knowledge in their
respective fields of observation, and often upon the knowledge
in detail of some small part of the large field,
much more than upon mathematical calculations as Leibniz
told Bernoulli; indeed it often happens that one cannot
make the proper observation or even the proper calculations
upon them unless one has a great deal of background
of knowledge.
It is getting easier to make the calculations, what with
the decimal notation, logarithms, slide rules, desk computing
machines, and the great computing machines
which are getting larger in capacity as they get smaller
in size due to miniaturization. The question is: Are we
getting smarter as fast as our tools are improving or are
we becoming more likely to cut ourselves intellectually
to bits with the increasing sharpness of the tools?
Statistics was statistics long before there was mathematical
probability, and statistics will remain statistics in
the sense that we need better data the better our equipment
to handle it. Many fields of statistical study today,
as always, have very little demonstrated background of
controllable causation. I will close with a quotation from
Yule's essay "The Function of Statistical Method in
Scientific Investigation" (Industrial Fatigue Research
Board, No. 28, 14 pp., H.M.S.O., London, 1924).
The unhappy statistician has to try to disentangle the
effect from the ravelled skein with which he is presented.
No easy matter this, very often; and a matter demanding
not merely a knowledge of method, but all the qualities
that an investigator can possess: strong common sense,
caution, reasoning power and imagination. And when he
has come to his conclusion the statistician must not forget
his caution; he should not be dogmatic. "You can prove
anything by statistics" is a common gibe. Its contrary
is more nearly true: you can never prove anything by
statistics. The statistician is dealing with most complex cases
of multiple causation. He may show that the facts are in
accordance with this hypothesis or that. But it is quite
another thing to show that all other possible hypotheses are
excluded, and that the facts do not admit of any
interpretation other than the particular one he may have in
mind.
The American Statistician, April, 1965
