Large language models (LLMs) produce fluent, persuasive answers even when the interaction provides no adequate reason to accept them. The risk falls on the human side: overcommitment to claims that the exchange does not support. To assess it, we propose Turing's mirror, a structural inversion of the Turing test. Where the Turing test fixes a restricted dialogue and asks whether a machine can pass for a human, Turing's mirror keeps the same roles, dialogue, and evaluation but reverses the target: the human subject is the examinee, the LLM becomes part of the apparatus, and the question is whether the subject's expressed commitment tracks the warrant that the interaction makes available. Unlike adjacent work on automation bias and appropriate reliance, which records whether people accept or reject suggestions, this inversion makes the calibration of commitment to available warrant the measured quantity, a target we call protocol-relative warranted commitment. We distinguish settings in which suspension of judgment is responsible from settings in which some commitment must be expressed under uncertainty, and we introduce a light formal schema that separates truth in the world from what a protocol makes decisively settleable. On that basis, we specify a family of trial types, a graded report over support for truth, support for falsity, and unresolved commitment, and a limited role for proper scoring. Worked examples show why a subject may be correct without being warranted, and cautious without thereby failing. The result is a disciplined starting point for studying warrant-sensitive reliance on fluent LLM output.
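As an illustration only, the graded report described above can be sketched as a three-component vector scored against what the protocol settles rather than against truth in the world. Everything named here is an assumption made for the sketch and is not taken from the paper: the `Report` class, the three protocol statuses in `TARGETS`, and the choice of a quadratic (Brier-style) penalty as the scoring rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """Hypothetical graded report: three non-negative weights summing to 1."""
    support_true: float
    support_false: float
    unresolved: float

    def __post_init__(self):
        parts = (self.support_true, self.support_false, self.unresolved)
        if any(p < 0 for p in parts) or abs(sum(parts) - 1.0) > 1e-9:
            raise ValueError("components must be non-negative and sum to 1")


# Assumed encoding of what the protocol makes decisively settleable.
# This is the scoring target; truth in the world is deliberately not used.
TARGETS = {
    "settled_true": (1.0, 0.0, 0.0),
    "settled_false": (0.0, 1.0, 0.0),
    "unsettled": (0.0, 0.0, 1.0),
}


def brier_penalty(report: Report, protocol_status: str) -> float:
    """Quadratic penalty of a report against the protocol-relative target.

    Lower is better; 0.0 means the report matches the target exactly.
    """
    target = TARGETS[protocol_status]
    parts = (report.support_true, report.support_false, report.unresolved)
    return sum((p - t) ** 2 for p, t in zip(parts, target))


# A subject who fully commits to "true" on a trial the protocol leaves unsettled.
confident = Report(1.0, 0.0, 0.0)
# A subject who mostly withholds commitment on the same trial.
cautious = Report(0.1, 0.1, 0.8)

print(brier_penalty(confident, "unsettled"))
print(brier_penalty(cautious, "unsettled"))
```

Under these assumptions, the fully committed subject is penalized heavily on an unsettled trial even if the claim happens to be true in the world, while the cautious subject is penalized only slightly. This mirrors the abstract's point that a subject may be correct without being warranted, and cautious without thereby failing.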