A Study of the Models Used in the Analysis of Certain Medical Data Ingram Olkin Stanford University
Date: Jan 1976 (est.)
Length: 2 pages
- Olkin, I.
- LEGAL DEPT FILES/BASEMENT GMP
- SCRT, SCIENTIFIC REPORT
- BIBL, BIBLIOGRAPHY
- LIST, LIST
- Named Person
- Date Loaded
- 05 Jun 1998
- Master ID
- 00500238 S & H Re: Dr. Olkin - Stanford Statistician - 'multi-Variate Logistic Risk Function
- 00500239 Dr. Ingram Olkin 'multivariate Logistic Risk Function'
- 00500242 Proposed Special Project Dr. Olkin - Stanford Statistician 'multivariate Logistic Risk Function' (A Study of Models Used in the Analysis of Certain Medical Data)
- 00500245-0251 Vita of Ingram Olkin
Page 1: wpd61e00
A Study of the Models Used in the Analysis of Certain Medical Data
Ingram Olkin
Stanford University

1. An Overview

In the early years of statistical analysis of biological data, a number of single-variable models were studied in great detail, e.g., the normal, logistic, gamma, negative binomial, etc. More recently, with the advent of computer technology and because most data involves many variables, it has become more common to use multivariate models, rather than univariate models. This led to the development of multivariate analogues of well known univariate distributions. It is important to recognize that a single, unique multivariate extension to a univariate model does not exist. In fact, there may be many such distributions. Very little research has been done to compare some of the possible multivariate extensions. Yet, each such extension could yield different interpretations of actual data.

The multivariate normal distribution is the most widely accepted extension of the univariate normal distribution. Several multivariate extensions of the exponential distribution, the gamma distribution and the beta distribution have been studied by Marshall and Olkin (1967a), (1967b), and by Olkin (1964). In the context of medical studies, Cornfield (1962) provides a multiple cross-classification model. A general survey of multivariate distributions is provided by Johnson and Kotz (1972).

The purpose of the present study is to review this general field of multivariate models and their relation to epidemiological studies.

2. Some Detailed Comments -- The Logistic Model

In studies such as the Framingham heart-disease study a common problem is to describe the way in which a set of variables X_1, ..., X_n influences a binary variable Y. Here each X_i represents a descriptive health statistic (such as cholesterol), while Y assumes the value 1 if the patient contracts the disease, or 0 if not.
On the assumption that the two populations (for Y=0 and Y=1) are normally distributed, the risk is described by a logistic function. This function has also been widely used by biologists studying growths of populations and species interaction and by economists. Unfortunately the assumption of normality is rarely satisfied, even approximately. Hence other models describing the influence of the X variables on Y must be considered.

There are several possible approaches for modification and generalization. One approach is to continue along the lines which lead to the logistic model without the assumption of normality.
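The link between normal populations and a logistic risk function can be checked numerically. The sketch below is illustrative only and is not taken from the proposal: it assumes a single variable x and, as an added assumption, a common standard deviation in the two populations, in which case the log-odds of disease are linear in x. The means, standard deviation, and prior probability are made-up values, and all function names are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def risk_by_bayes(x, mean0, mean1, sd, prior1):
    """P(Y=1 | x) from Bayes' rule with two normal populations.

    Assumption (not stated in the source): both populations share
    the same standard deviation `sd`.
    """
    num = prior1 * normal_pdf(x, mean1, sd)
    den = num + (1.0 - prior1) * normal_pdf(x, mean0, sd)
    return num / den


def logistic_coefficients(mean0, mean1, sd, prior1):
    """Intercept and slope of the log-odds implied by the normal model."""
    slope = (mean1 - mean0) / sd ** 2
    intercept = (math.log(prior1 / (1.0 - prior1))
                 - (mean1 ** 2 - mean0 ** 2) / (2.0 * sd ** 2))
    return intercept, slope


def risk_by_logistic(x, intercept, slope):
    """Logistic risk function 1 / (1 + exp(-(a + b x)))."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


# Hypothetical cholesterol-like values, chosen only for illustration.
MEAN0, MEAN1, SD, PRIOR1 = 220.0, 250.0, 40.0, 0.1
A, B = logistic_coefficients(MEAN0, MEAN1, SD, PRIOR1)

for x in (150.0, 200.0, 250.0, 300.0):
    print(x, risk_by_bayes(x, MEAN0, MEAN1, SD, PRIOR1), risk_by_logistic(x, A, B))
```

The two columns of risks agree up to rounding error. If the two standard deviations differ, the log-odds acquire a quadratic term in x and the simple logistic form no longer applies, which is one way the normality-based derivation can fail.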
Page 2: wpd61e00
Another approach starts from the observation that the logistic is a distribution in one variable, and hence a multivariate logistic may be the correct form for several variables. A few multivariate logistic distributions have been proposed, but little is known about their properties.

Finally, because the risk of contracting a disease should increase with the level of certain health measurements (such as cholesterol count), this fact should be taken into account when examining the data. One method for doing this is called isotonic regression and may prove quite useful in analyzing the data of the Framingham heart study.

The problem of constructing general joint distributions of variables is an important one. The key point is how to build into the model the dependency between the variables. Obviously, if one chooses convenient mathematical functions, the result may not conform to reality. Thus, one needs to abstract properties from applications and to then construct a connection that maintains these properties. Another procedure is to use a physical model to generate the dependence. Both procedures need to be investigated in developing a model for a joint distribution.

References

1. Cornfield, J. (1962) Joint dependence of risk of coronary heart disease on serum cholesterol and systolic blood pressure: a discriminant function analysis. Federation Proceedings 21, Part II, Supplement No. 11, 58-61.
2. Johnson, N. L. and Kotz, S. (1972) Distributions in Statistics: Continuous Multivariate Distributions.
3. Marshall, A. W. and Olkin, I. (1967) A multivariate exponential distribution. J. Amer. Statist. Assoc. 30-44.
4. Marshall, A. W. and Olkin, I. (1967) A generalized bivariate exponential distribution. J. Applied Prob. 291-302.
5. Olkin, I. (1964) Multivariate beta distributions and independence properties of the Wishart distribution. Ann. Math. Statist. 35, 261-269.
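As an illustration of the isotonic regression mentioned in Section 2, the following is a minimal sketch of the pool-adjacent-violators procedure, one standard way to compute a least-squares fit that is constrained to be non-decreasing. It is not taken from the proposal; the function name `isotonic_fit`, the cholesterol bands, and the disease proportions are hypothetical and chosen only to show the mechanics.

```python
def isotonic_fit(values, weights=None):
    """Weighted least-squares non-decreasing fit (pool adjacent violators).

    `values` are observed responses ordered by increasing level of the
    health measurement; `weights` are optional group sizes.
    """
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block holds [weighted mean, total weight, item count]
    for value, weight in zip(values, weights):
        blocks.append([float(value), float(weight), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean2, weight2, count2 = blocks.pop()
            mean1, weight1, count1 = blocks.pop()
            total = weight1 + weight2
            pooled = (mean1 * weight1 + mean2 * weight2) / total
            blocks.append([pooled, total, count1 + count2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Hypothetical observed disease proportions in five increasing
# cholesterol bands, with hypothetical group sizes.
observed = [0.04, 0.09, 0.07, 0.12, 0.11]
group_sizes = [200, 150, 150, 100, 100]
print(isotonic_fit(observed, group_sizes))
```

Adjacent bands whose observed proportions run against the assumed ordering are pooled into a single weighted average, so the fitted risk never decreases as the measurement level rises.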