Re: re unrelated males in household
Will Johnson's contribution is perhaps expressed in unfamiliar terms.
1. In large data sets the problem is usually that substantively weak
relationships will nevertheless often reach conventional levels of
statistical significance. The "Michelin Guide" approach of reporting more
and more stringent levels with more and more stars is beside the point: it
is far more appropriate to report a PRE (proportionate reduction in error)
measure such as an odds ratio or correlation coefficient.
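
To make point 1 concrete, here is a minimal Python sketch. It assumes a Pearson correlation of 0.02 and a large-sample normal approximation for the test statistic; the function name and all figures are illustrative and are not taken from any data discussed in this thread. It shows the same weak correlation becoming "significant" purely because n grows, while r squared (a PRE measure) stays put.

```python
import math

def correlation_test(r, n):
    """Return (t statistic, two-sided p-value, r squared) for a Pearson r.

    Assumption: the p-value uses the normal approximation to the
    t distribution, which is only reasonable for large n.
    """
    t = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
    p = math.erfc(abs(t) / math.sqrt(2.0))  # two-sided tail area
    return t, p, r * r

# Same weak relationship (r = 0.02), three hypothetical sample sizes.
for n in (100, 10_000, 1_000_000):
    t, p, r2 = correlation_test(0.02, n)
    print(f"n={n:>9,}  t={t:6.2f}  p={p:.3g}  r^2={r2:.4f}")
```

The stars would multiply as n increases, yet the proportion of variance accounted for never changes, which is why the strength measure is the more informative thing to report.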
2. It is of course true that "When the rate of occurrence of an event is
very low ... [few] variables ... will ... have a statistically significant
relationship ... This is because of the low rate of occurrence..."
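
One way to see why point 2 holds is a rough sketch, assuming a two-group comparison tested with a Wald statistic on the log odds ratio. The function name, the group size of 1000 and the base rates are hypothetical choices for illustration only. With the odds ratio held fixed, the standard error is dominated by the smallest cells, so a rare outcome gives a much smaller test statistic.

```python
import math

def wald_z_for_odds_ratio(base_rate, odds_ratio, n_per_group):
    """Wald z statistic for a log odds ratio, using expected cell counts.

    base_rate is the event rate in the comparison group (an assumed figure).
    """
    odds0 = base_rate / (1.0 - base_rate)
    odds1 = odds_ratio * odds0
    rate1 = odds1 / (1.0 + odds1)
    # Expected 2x2 cells: events and non-events in each group.
    a = n_per_group * rate1
    b = n_per_group * (1.0 - rate1)
    c = n_per_group * base_rate
    d = n_per_group * (1.0 - base_rate)
    # Standard error of the log odds ratio.
    se = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    return math.log(odds_ratio) / se

# Same odds ratio (1.5) and sample size, common versus rare outcome.
for base_rate in (0.30, 0.005):
    z = wald_z_for_odds_ratio(base_rate, odds_ratio=1.5, n_per_group=1000)
    print(f"base rate {base_rate:.3f}: z = {z:.2f}")
```

Under these assumptions the common outcome clears the conventional 1.96 threshold and the rare one does not, although the strength of the relationship is identical.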
3. In established mathematical usage a constant function is one that is
invariant under some defined conditions, whereas a variable function is one
that takes different forms or values under different conditions. For
instance, the well-known relationship between unemployment and inflation may
differ between countries or between historical periods. Child death has a
very low incidence, but nevertheless the incidence is a function of many
factors. If all factors work in the same way in all relevant circumstances
we could say death is a constant function of these factors. If no factors
have any effect then death is not a constant function but a constant.
4. In practical terms, the biggest problem, as has been pointed out by
several contributors, is the inevitably high rate of false positives when
the outcome variable has low incidence. At the risk of lapsing into
senescent reminiscence, I first came across this in the mid-60s, in the
writings of Sheldon and Eleanor Glueck, who totally failed
to understand the point despite repeated attempts on the part of eminent
methodologists. If I come across a reference to this I shall post it --
unless someone beats me to it.
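
The false-positive problem in point 4 can be illustrated with a short calculation. This is only a sketch, assuming a predictor that correctly flags 90% of true cases and correctly clears 90% of non-cases; those two figures, the incidence values and the function name are hypothetical and do not come from the Gluecks' work or from any data in this thread.

```python
def share_of_false_positives(incidence, sensitivity, specificity):
    """Fraction of all flagged cases that are false positives (Bayes' rule)."""
    true_pos = incidence * sensitivity
    false_pos = (1.0 - incidence) * (1.0 - specificity)
    return false_pos / (true_pos + false_pos)

# Same hypothetical predictor applied to outcomes of decreasing incidence.
for incidence in (0.50, 0.10, 0.01):
    share = share_of_false_positives(incidence, sensitivity=0.90, specificity=0.90)
    print(f"incidence {incidence:.2f}: {share:.0%} of positives are false")
```

With these assumed figures the share of false positives rises from 10% at an incidence of one half to about 92% at an incidence of one in a hundred, even though the predictor itself has not changed.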