Noncanonical Links in Generalized Linear Models - When is the Effort Justified?
Description
28 years ago
Generalized linear models (GLM) allow for a wide range of
statistical models for regression data. In particular, the logistic
model is usually applied for binomial observations. Canonical links
for GLMs, such as the logit link in the binomial case, are often
used because in this case sufficient statistics for the regression
parameter exist, which allows for simple interpretation of the
results. However, in some applications, the overall fit as measured
by the p-values of goodness-of-fit statistics (such as the residual
deviance) can be improved significantly by the use of a
noncanonical link. In this case, the interpretation of the
influence of the covariates is more complicated than for GLMs
with canonical link functions. It will be illustrated through
simulation that the p-value associated with the common goodness of
link tests is not appropriate to quantify the changes to mean
response estimates and other quantities of interest when switching
to a noncanonical link. In particular, the rate of
misspecifications becomes considerable when the inverse
information value associated with the underlying parametric link
model increases. This shows that the classical tests are often too
sensitive, particularly when the number of observations is large.
The consideration of a generalized p-value function is proposed
instead, which allows the exact quantification of a suitable
distance to the canonical model at a controlled error rate.
Corresponding tests for validating or discriminating the canonical
model can easily be performed by means of this function.
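
The comparison of a canonical and a noncanonical link by residual deviance can be illustrated with a minimal sketch. It is not the method of the talk: it fits a binomial GLM by Fisher scoring under the logit link and under the complementary log-log link (one possible noncanonical choice, assumed here) and reports the residual deviance of each fit. The data are the Bliss (1935) beetle mortality counts as commonly reproduced in GLM textbooks, chosen purely for illustration, and all function and variable names are hypothetical.

```python
import math

# Bliss (1935) beetle mortality data as commonly reproduced in GLM textbooks.
# Assumption: used only for illustration; the data are not taken from the talk.
log_dose = [1.6907, 1.7242, 1.7552, 1.7842, 1.8113, 1.8369, 1.8610, 1.8839]
n_exposed = [59, 60, 62, 56, 63, 59, 62, 60]
n_killed = [6, 13, 18, 28, 52, 53, 61, 60]


def logit_inverse(eta):
    """Return (mu, dmu/deta) for the canonical logit link."""
    mu = 1.0 / (1.0 + math.exp(-eta))
    return mu, mu * (1.0 - mu)


def cloglog_inverse(eta):
    """Return (mu, dmu/deta) for the complementary log-log link."""
    e = math.exp(eta)
    return 1.0 - math.exp(-e), e * math.exp(-e)


# Each entry: (link function g, inverse link returning mu and its derivative).
LINKS = {
    "logit": (lambda m: math.log(m / (1.0 - m)), logit_inverse),
    "cloglog": (lambda m: math.log(-math.log(1.0 - m)), cloglog_inverse),
}


def clamp(p, eps=1e-10):
    """Keep a probability strictly inside (0, 1)."""
    return min(max(p, eps), 1.0 - eps)


def fit_binomial_glm(x, n, y, link, iterations=50):
    """Fit eta = b0 + b1 * (x - mean(x)) by Fisher scoring (IRLS).

    Returns (b0, b1, residual deviance); b0 refers to the centred covariate.
    """
    g, g_inv = LINKS[link]
    x_mean = sum(x) / len(x)
    xc = [xi - x_mean for xi in x]
    # Usual GLM start: link applied to slightly shrunken observed proportions.
    eta = [g((yi + 0.5) / (ni + 1.0)) for yi, ni in zip(y, n)]
    b0 = b1 = 0.0
    for _ in range(iterations):
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for xi, ni, yi, ei in zip(xc, n, y, eta):
            mu, dmu = g_inv(ei)
            mu, dmu = clamp(mu), max(dmu, 1e-10)
            w = ni * dmu * dmu / (mu * (1.0 - mu))  # Fisher scoring weight
            z = ei + (yi / ni - mu) / dmu           # working response
            s_w += w
            s_wx += w * xi
            s_wxx += w * xi * xi
            s_wz += w * z
            s_wxz += w * xi * z
        # Solve the 2x2 weighted least-squares normal equations.
        det = s_w * s_wxx - s_wx * s_wx
        b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        b1 = (s_w * s_wxz - s_wx * s_wz) / det
        eta = [b0 + b1 * xi for xi in xc]

    def term(a, b):
        # Convention 0 * log(0) = 0 for groups with y = 0 or y = n.
        return a * math.log(a / b) if a > 0 else 0.0

    deviance = 0.0
    for ni, yi, ei in zip(n, y, eta):
        mu = clamp(g_inv(ei)[0])
        deviance += 2.0 * (term(yi, ni * mu) + term(ni - yi, ni * (1.0 - mu)))
    return b0, b1, deviance


fits = {name: fit_binomial_glm(log_dose, n_exposed, n_killed, name) for name in LINKS}
for name, (b0, b1, deviance) in fits.items():
    print(f"{name:8s} slope = {b1:7.3f}  residual deviance = {deviance:6.2f}")
```

On these data the complementary log-log fit yields the smaller residual deviance, but the fitted slopes are on different scales and are no longer log odds ratios, which is the interpretive cost the abstract refers to.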