Blacksburg, Virginia, United States of America
  •  110
    After some general remarks about the interrelation between philosophical and statistical thinking, the discussion centres largely on significance tests. These are defined as the calculation of p-values rather than as formal procedures for ‘acceptance’ and ‘rejection’. A number of types of null hypothesis are described and a principle for evidential interpretation set out governing the implications of p-values in the specific circumstances of each application, as contrasted with a long-run inter…
  •  181
    Ontology & Methodology
    Synthese 192 (11): 3413-3423. 2015.
    Philosophers of science have long been concerned with the question of what a given scientific theory tells us about the contents of the world, but relatively little attention has been paid to how we set out to build theories and to the relevance of pre-theoretical methodology on a theory’s interpretation. In the traditional view, the form and content of a mature theory can be separated from any tentative ontological assumptions that went into its development. For this reason, the target of inter…
  •  71
    Severe Testing: Error Statistics versus Bayes Factor Tests
    British Journal for the Philosophy of Science. forthcoming.
  •  193
    I argue that the Bayesian Way of reconstructing Duhem's problem fails to advance a solution to the problem of which of a group of hypotheses ought to be rejected or "blamed" when experiment disagrees with prediction. But scientists do regularly tackle and often enough solve Duhemian problems. When they do, they employ a logic and methodology which may be called error statistics. I discuss the key properties of this approach which enable it to split off the task of testing auxiliary hypotheses fr…
  • 'Peirce-pectives' on Metaphysics and the Sciences
    with Susan Haack, Rosa Mayorga, Jaime Nubiola, Cornelis de Waal, Robert G. Meyers, Joseph C. Pitt, and Nicholas Rescher
    Transactions of the Charles S. Peirce Society 41 (2): 237-365. 2005.
  •  33
    Science, Error Statistics, and Arguing from Error
    Poznan Studies in the Philosophy of the Sciences and the Humanities 71 95-111. 2000.
  •  21
    Toward a More Objective Understanding of the Evidence of Carcinogenic Risk
    PSA Proceedings of the Biennial Meeting of the Philosophy of Science Association 1988 (2): 489-503. 1988.
    The field of quantified risk assessment is a new field, only about 20 years old, and already it is considered to be in a crisis. As Funtowicz and J.R. Ravetz (1985) put it: The concept of risk in terms of probability has proved to be so elusive, and statistical inference so problematic, that many experts in the field have recently either lost hope of finding a scientific solution or lost faith in Risk Analysis as a tool for decisionmaking. (p.219) Thus the ‘art’ of the assessment of risks… is at a…
  •  44
    Error and the Growth of Experimental Knowledge
    University of Chicago. 1996.
    This text provides a critique of the subjective Bayesian view of statistical inference, and proposes the author's own error-statistical approach as an alternative framework for the epistemology of experiment. It seeks to address the needs of researchers who work with statistical analysis.
  •  22
    Cartwright, Causality, and Coincidence
    PSA Proceedings of the Biennial Meeting of the Philosophy of Science Association 1986 (1): 42-58. 1986.
    In How the Laws of Physics Lie (1983) Cartwright argues for being a realist about theoretical entities but non-realist about theoretical laws. Her reason for this distinction is that only the former involves causal explanation, and accepting causal explanations commits us to the existence of the causal entity invoked. “What is special about explanation by theoretical entity is that it is causal explanation, and existence is an internal characteristic of causal claims. There is nothing similar f…
  •  94
    Some surprising facts about surprising facts
    Studies in History and Philosophy of Science Part A 45 79-86. 2014.
    A common intuition about evidence is that if data x have been used to construct a hypothesis H, then x should not be used again in support of H. It is no surprise that x fits H, if H was deliberately constructed to accord with x. The question of when and why we should avoid such “double-counting” continues to be debated in philosophy and statistics. It arises as a prohibition against data mining, hunting for significance, tuning on the signal, and ad hoc hypotheses, and as a preference for prede…
  •  123
    While the common procedure of statistical significance testing and its accompanying concept of p-values have long been surrounded by controversy, renewed concern has been triggered by the replication crisis in science. Many blame statistical significance tests themselves, and some regard them as sufficiently damaging to scientific practice as to warrant being abandoned. We take a contrary position, arguing that the central criticisms arise from misunderstanding and misusing the statistical tools…
  •  278
    How to discount double-counting when it counts: Some clarifications
    British Journal for the Philosophy of Science 59 (4): 857-879. 2008.
    The issues of double-counting, use-constructing, and selection effects have long been the subject of debate in the philosophical as well as statistical literature. I have argued that it is the severity, stringency, or probativeness of the test—or lack of it—that should determine if a double-use of data is admissible. Hitchcock and Sober ([2004]) question whether this ‘severity criterion’ can perform its intended job. I argue that their criticisms stem from a flawed interpretation of the severity…
  •  108
    Significance Tests: Vitiated or Vindicated by the Replication Crisis in Psychology?
    Review of Philosophy and Psychology 12 (1): 101-120. 2020.
    The crisis of replication has led many to blame statistical significance tests for making it too easy to find impressive looking effects that do not replicate. However, the very fact it becomes difficult to replicate effects when features of the tests are tied down actually serves to vindicate statistical significance tests. While statistical significance tests, used correctly, serve to bound the probabilities of erroneous interpretations of data, this error control is nullified by data-dredging…
  •  297
    Methodology in Practice: Statistical Misspecification Testing
    Philosophy of Science 71 (5): 1007-1025. 2004.
    The growing availability of computer power and statistical software has greatly increased the ease with which practitioners apply statistical methods, but this has not been accompanied by attention to checking the assumptions on which these methods are based. At the same time, disagreements about inferences based on statistical research frequently revolve around whether the assumptions are actually met in the studies available, e.g., in psychology, ecology, biology, risk assessment. Philosophica…
  •  49
    Acceptable Evidence (edited book)
    with Rachelle D. Hollander
    Oxford University Press USA. 1994.
    Discussions of science and values in risk management have largely focused on how values enter into arguments about risks, that is, issues of acceptable risk. Instead this volume concentrates on how values enter into collecting, interpreting, communicating, and evaluating the evidence of risks, that is, issues of the acceptability of evidence of risk. By focusing on acceptable evidence, this volume avoids two barriers to progress. One barrier assumes that evidence of risk is largely a matter of o…
  •  57
    About Thinking (review)
    Teaching Philosophy 5 (1): 80-83. 1982.
  •  172
    Some methodological issues in experimental economics
    Philosophy of Science 75 (5): 633-645. 2008.
    The growing acceptance and success of experimental economics has increased the interest of researchers in tackling philosophical and methodological challenges to which their work increasingly gives rise. I sketch some general issues that call for the combined expertise of experimental economists and philosophers of science, of experiment, and of inductive‐statistical inference and modeling. †To contact the author, please write to: 235 Major Williams, Virginia Tech, Blacksburg, VA 24061‐0126; e‐m…
  •  139
    Response to Howson and Laudan
    Philosophy of Science 64 (2): 323-333. 1997.
    A toast is due to one who slays Misguided followers of Bayes, And in their heart strikes fear and terror With probabilities of error! (E.L. Lehmann)
  •  280
    Novel evidence and severe tests
    Philosophy of Science 58 (4): 523-552. 1991.
    While many philosophers of science have accorded special evidential significance to tests whose results are "novel facts", there continues to be disagreement over both the definition of novelty and why it should matter. The view of novelty favored by Giere, Lakatos, Worrall and many others is that of use-novelty: An accordance between evidence e and hypothesis h provides a genuine test of h only if e is not used in h's construction. I argue that what lies behind the intuition that novelty matter…
  •  487
    Experimental practice and an error statistical account of evidence
    Philosophy of Science 67 (3): 207. 2000.
    In seeking general accounts of evidence, confirmation, or inference, philosophers have looked to logical relationships between evidence and hypotheses. Such logics of evidential relationship, whether hypothetico-deductive, Bayesian, or instantiationist fail to capture or be relevant to scientific practice. They require information that scientists do not generally have (e.g., an exhaustive set of hypotheses), while lacking slots within which to include considerations to which scientists regularly…
  •  225
    I document some of the main evidence showing that E. S. Pearson rejected the key features of the behavioral-decision philosophy that became associated with the Neyman-Pearson Theory of statistics (NPT). I argue that NPT principles arose not out of behavioral aims, where the concern is solely with behaving correctly sufficiently often in some long run, but out of the epistemological aim of learning about causes of experimental results (e.g., distinguishing genuine from spurious effects). The view…
  •  105
    The Philosophical Relevance of Statistics
    PSA: Proceedings of the Biennial Meeting of the Philosophy of Science Association 1980. 1980.
    While philosophers have studied probability and induction, statistics has not received the kind of philosophical attention mathematics and physics have. Despite increasing use of statistics in science, statistical advances have been little noted in the philosophy of science literature. This paper shows the relevance of statistics to both theoretical and applied problems of philosophy. It begins by discussing the relevance of statistics to the problem of induction and then discusses the reasoning…
  •  142
    Error and the Growth of Experimental Knowledge
    with Michael Kruse
    Philosophical Review 107 (2): 324. 1998.
    Once upon a time, logic was the philosopher’s tool for analyzing scientific reasoning. Nowadays, probability and statistics have largely replaced logic, and their most popular application—Bayesianism—has replaced the qualitative deductive relationship between a hypothesis h and evidence e with a quantitative measure of h’s probability in light of e.
  •  649
    Severe testing as a basic concept in a Neyman–Pearson philosophy of induction
    British Journal for the Philosophy of Science 57 (2): 323-357. 2006.
    Despite the widespread use of key concepts of the Neyman–Pearson (N–P) statistical paradigm—type I and II errors, significance levels, power, confidence levels—they have been the subject of philosophical controversy and debate for over 60 years. Both current and long-standing problems of N–P tests stem from unclarity and confusion, even among N–P adherents, as to how a test's (pre-data) error probabilities are to be used for (post-data) inductive inference as opposed to inductive behavior. We ar…