  •  111
    How should billions of species observations worldwide be shared and made reusable? Many biodiversity scientists assume the ideal solution is to standardize all datasets according to a single, universal classification and aggregate them into a centralized, global repository. This ideal has known practical and theoretical limitations, however, which justifies investigating alternatives. To support better community deliberation and normative evaluation, we develop a novel conceptual framework showi…
  •  1498
    Criticism of big data has focused on showing that more is not necessarily better, in the sense that data may lose their value when taken out of context and aggregated together. The next step is to incorporate an awareness of pitfalls for aggregation into the design of data infrastructure and institutions. A common strategy minimizes aggregation errors by increasing the precision of our conventions for identifying and classifying data. As a counterpoint, we argue that there are pragmatic trade-of…
  •  71
    Bats, objectivity, and viral spillover risk
    with Beckett Sterner, Steve Elliott, and Nate Upham
    History and Philosophy of the Life Sciences 43 (1): 1-5. 2021.
    What should the best practices be for modeling zoonotic disease risks, e.g. to anticipate the next pandemic, when background assumptions are unsettled or evolving rapidly? This challenge runs deeper than one might expect, all the way into how we model the robustness of contemporary phylogenetic inference and taxonomic classifications. Different and legitimate taxonomic assumptions can destabilize the putative objectivity of zoonotic risk assessments, thus potentially supporting inconsistent and …
  •  113
    The collection and classification of data into meaningful categories is a key step in the process of knowledge making. In the life sciences, the design of data discovery and integration tools has relied on the premise that a formal classificatory system for expressing a body of data should be grounded in consensus definitions for classifications. On this approach, exemplified by the realist program of the Open Biomedical Ontologies Foundry, progress is maximized by grounding the representation a…
  •  152
    Big data is opening new angles on old questions about scientific progress. Is scientific knowledge cumulative? If yes, how does it make progress? In the life sciences, what we call the Consensus Principle has dominated the design of data discovery and integration tools: the design of a formal classificatory system for expressing a body of data should be grounded in consensus. Based on current approaches in biomedicine and systematic biology, we formulate and compare three types of the Consensus …
  •  210
    Outline of an explanatory account of cladistic practice
    Biology and Philosophy 20 (2-3): 489-515. 2005.
    A naturalistic account of the strengths and limitations of cladistic practice is offered. The success of cladistics is claimed to be largely rooted in the parsimony-implementing congruence test. Cladists may use the congruence test to iteratively refine assessments of homology, and thereby increase the odds of reliable phylogenetic inference under parsimony. This explanation challenges alternative views which tend to ignore the effects of parsimony on the process of character individuation in sy…