On the Greek island of Kalythos the male inhabitants suffer from a
congenital eye disease, the effects of which become more marked with
increasing age. Samples of islander males of various ages were tested for
blindness and the results recorded. The data is shown in
Table
.
The problem we consider is to fit both logistic and probit models to this data, and to estimate for each model the LD50, that is the age at which the chance of blindness for a male inhabitant is 50%.
If y is the number of blind at age x and n the number tested, both
models have the form
where for the probit case,
is the standard normal
distribution function, and in the logit case, (the default), F(z) =
ez/(1+ez) . In both cases the LD50 is
that is, the point at which the argument of the distribution function is
zero.
The first step is to set the data up as a data frame
kalythos <- data.frame(x=c(20,35,45,55,70), n=rep(50,5),
y=c(6,17,26,37,44))
To fit a binomial model using glm() there are two possibilities for the response:
Here we need the second of these conventions, so we add a matrix to our data frame:
kalythos$Ymat <- cbind(kalythos$y, kalythos$n - kalythos$y)
To fit the models we use
fmp <- glm( Ymatx, family=binomial(link=probit), data=kalythos)
fml <- glm( Ymatx, family=binomial, data=kalythos)
Since the logit link is the default the parameter may be omitted on the second call. To see the results of each fit we could use
summary(fmp)
summary(fml)
Both models fit (all too) well. To find the LD50 estimate we can use a simple function:
ld50 <- function(b) -b[1]/b[2]
ldp <- ld50(coef(fmp)); ldl <- ld50(coef(fmp)); c(ldp, ldl)
The actual estimates from this data are 43.663 years and 43.601 years respectively.