•  112
    We present B_BL, a nine-valued bilateral modal logic based on the bilattice NINE. We then construct a concrete computational model of BBL for factuality evaluation of large language models (LLMs). The model defines an accessibility relation representing meaning-preserving syntactic variation of queries asking an LLM to provide verification and refutation for a given proposition, and a valuation function grounded in the responses to such queries. An experimental study of the valuation function ac…
  •  350
    Neurosymbolic Knowledge Engineering with Natural Language
    Dissertation, University of Amsterdam. 2026.
    Since the nineteen-seventies, knowledge engineering as a discipline has struggled with the implementation problem: the difficulty of translating expert knowledge expressed in natural language into a formal knowledge representation to be adopted by organizations and communities for use in automated decision making. This knowledge acquisition bottleneck remains a fundamental barrier. We argue that large language models (LLMs) provide a means to address the implementation problem, by allowing knowl…
  •  278
    We present Elenchus, a dialogue system for knowledge base construction grounded in inferentialist semantics, where knowledge engineering is re-conceived as explicitation rather than extraction from expert testimony or textual content. A human expert develops a bilateral position (commitments and denials) about a topic through prover-skeptic dialogue with a large language model (LLM) opponent. The LLM proposes tensions (claims that parts of the position are jointly incoherent) which the expert re…
  •  478
    Sound and Complete Neuro-symbolic Reasoning with LLM-Grounded Interpretations
    with Prateek Chhikara, Thomas Macaulay Ferguson, Filip Ilievski, and Paul Groth
    In Leilani Gilpin, Eleonora Giunchiglia, Pascal Hitzler & Emile van Krieken (eds.), Proceedings of the 19th Conference on Neurosymbolic Learning and Reasoning, Proceedings of Machine Learning Research. forthcoming.
    Large language models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation, but they exhibit problems with logical consistency in the output they generate. How can we harness LLMs' broad-coverage parametric knowledge in formal reasoning despite their inconsistency? We present a method for directly integrating an LLM into the interpretation function of the formal semantics for a paraconsistent logic. We provide experimental evidence for the feasibility…
  •  545
    Formalizations serve as cognitive tools. By enabling algorithmic reasoning over sets of statements in a formal language, they provide a cognitive boost for human reasoners. We argue that the emergence of large language models (LLMs) as a technology for the analysis and generation of natural language provides a new perspective on the relative roles of formal and natural languages in formalization.
  •  435
    A Benchmark for the Detection of Metalinguistic Disagreements between LLMs and Knowledge Graphs
    with Paul Groth
    In Reham Alharbi, Jacopo de Berardinis, Paul Groth, Albert Meroño-Peñuela, Elena Simperl & Valentina Tamma (eds.), ISWC 2024 Special Session on Harmonising Generative AI and Semantic Web Technologies, CEUR-WS. forthcoming.
    Evaluating large language models (LLMs) for tasks like fact extraction in support of knowledge graph construction frequently involves computing accuracy metrics using a ground truth benchmark based on a knowledge graph (KG). These evaluations assume that errors represent factual disagreements. However, human discourse frequently features metalinguistic disagreement, where agents differ not on facts but on the meaning of the language used to express them. Given the complexity of natural language …
  •  513
    In his 1955 essay "Meaning and synonymy in natural languages", Rudolf Carnap presents a thought experiment wherein an investigator provides a hypothetical robot with a definition of a concept together with a description of an individual, and then asks the robot if the individual is in the extension of the concept. In this work, we show how to realize Carnap's Robot through knowledge probing of a large language model (LLM), and argue that this provides a useful cognitive tool for conceptual engi…
  •  1246
    Conceptual Engineering Using Large Language Models
    In Vincent C. Müller, Leonard Dung, Guido Löhr & Aliya Rumana (eds.), Philosophy of Artificial Intelligence: The State of the Art, Springer Nature. 2026.
    We describe a method, based on Jennifer Nado’s proposal for classification procedures as targets of conceptual engineering, that implements such procedures by prompting a large language model. We apply this method, using data from the Wikidata knowledge graph, to evaluate stipulative definitions related to two paradigmatic conceptual engineering projects: the International Astronomical Union’s redefinition of PLANET and Haslanger’s ameliorative analysis of WOMAN. Our results show that classifica…