According to influential accounts of scientific method, such as critical rationalism, scientific knowledge grows by repeatedly testing our best hypotheses. But despite the popularity of hypothesis tests in statistical inference and science in general, their philosophical foundations remain shaky. In particular, the interpretation of non-significant results (those that do not reject the tested hypothesis) poses a major philosophical challenge. To what extent do they corroborate the tested hypothesis, or provide a reason to accept it? Popper sought measures of corroboration that could adequately answer this question. According to Popper, corroboration is different from probability-raising, and is grounded in the predictive success and testability of a hypothesis. As such, corroboration becomes an indicator of the scientific value of a hypothesis and guides our practical preferences over hypotheses that have been subjected to severe tests. This article proves two impossibility results for corroboration measures based on statistical relevance. The generality of these results shows that Popper’s qualitative characterization of corroboration must be misguided. I explore what a more promising and scientifically useful concept of corroboration could look like.