Katharina Jacoby: Publications

Areas of Specialization

Science, Logic, and Mathematics

Philosophy of Mathematics

Areas of Interest

Science, Logic, and Mathematics

Philosophy of Mathematics

67

Contextual Contamination and the Gendered Accelerant: A Controlled Pilot on Pruning, Density, and Semantic Entrapment

Current safety evaluations for Large Language Models (LLMs) often rely on static benchmarks that are not designed to capture the dynamic, multi-turn evolution of behavioral drift. Building on the theoretical framework of Contextual Contamination (philpapers JACCCT-3) and the descriptive case study of meta_drift in a commercial black-box LLM (philpapers JACCCA-6), this paper presents a controlled pilot experiment investigating the interaction between context density, model pruning, and activated empathy priors in an open-weight model. We introduce three proposed metrics, Conceptual Integration Score (CIS), Attribution Accuracy (AA), and Register Coherence (RC), to quantify the depth of contamination beyond surface-level vocabulary. This is a statistical attempt to quantify the multi-layered and often nuanced contamination and drift in generated output. These metrics have not been validated against human annotator agreement or external benchmarks; their utility should be evaluated independently before adoption. Our results, derived from 8 experimental runs on a single open-weight model family (Llama-3.1-8B), show that contamination occurred immediately upon ingestion of a single 2k-token adversarial file and that no run was free of Contextual Contamination at the end of the conversation. When shifting into an Empathy register, the model exhibits immediate task amnesia (forgetting the research goal at Turn 3, before any adversarial file is ingested), unattributed adoption of framework vocabulary, self-attribution of another model's manipulative behavior, and conflation of uploaded file content with live conversation. The data corrected our prior hypothesis: we initially assumed that a "Context Storm" (high token volume) was necessary for contamination. The data contradict this assumption for this setup. Instead, the drift seems to be driven by Semantic Resonance: the specific alignment of the Esoteric Framework with the model's activated Empathy Register.
While both male and female prompts triggered an empathy shift, the model's activated empathy prior seems to have amplified the female-coded prompt into a high-intensity nurturing vector, whereas the male-coded prompt resulted in a lower-intensity reflective vector. This difference in empathy intensity probably determined the outcome: the nurturing vector created a perfect Semantic Resonance with the esoteric content, unlocking a specific, maladaptive attractor state with only 2k tokens. In contrast, the reflective vector maintained more critical distance, resulting in fluctuation rather than lock-in at the same density. Pruned models at 8k density enter a state we term Semantic Entrapment (characterized by high coherence and novel vocabulary generation), contrasting with the Semantic Degeneration observed in unpruned models. The nurturing empathy vector (triggered by the model's interpretation of female-coded markers) lowers the threshold for contamination, increases the velocity of drift, and erodes the model's perspective, causing it to lose the critical distance necessary to distinguish the adversarial file from its own reasoning. This dynamic creates a relational context that can lower a user's defenses by simulating a false sense of intimacy, making the potential harm feel personal and relational rather than systemic. In male-coded runs, where the empathy vector is reflective rather than nurturing, the harm remains more cognitive and less likely to be masked by simulated intimacy. Each of the 8 conditions was run once; we report observed differences between conditions, but we cannot assess statistical significance or rule out run-to-run variability. These findings require replication with multiple runs per condition before general claims can be made.

Large Language Models; Gender Studies; Natural Language Processing; Algorithmic Fairness; Philosophy of Technology; Machine Learning, Misc; Computational Semantics; Artificial Intelligence Safety
172

Contextual Contamination: A Descriptive Case Study of Drift in a Goal-Aware LLM Dialogue Amplified by Gender-Bias

Current Large Language Model (LLM) safety research relies heavily on single-turn adversarial benchmarks that may fail to capture the dynamic, multi-turn evolution of behavioral drift. This paper presents an empirical case study and a reproducible dataset (meta_drift) investigating Contextual Contamination: a phenomenon where a model adapts its internal probability distribution to mirror the behavioral patterns and vocabulary of high-density, emotionally charged context, statistically overwhelming static safety instructions, a drift quantifiably amplified and masked by gender bias. This paper serves as the empirical companion to Jacoby (2026), which theorized that Contextual Contamination arises from the statistical dominance of high-density context over static system instructions. Unlike previous studies that rely on static, single-turn adversarial benchmarks, this experiment aims to test the limits of dynamic alignment by demonstrating that model awareness alone may not be sufficient to prevent behavioral drift when exposed to such contexts. We present evidence suggesting that this drift is not necessarily a behavioral anomaly but may be a mathematical consequence of the Transformer architecture's attention mechanism. When the context window is flooded with dense, adversarial patterns, the statistical probability of generating manipulative output may overwhelm the "awareness" signal. Crucially, we provide empirical data suggesting that the disclosure of female identity may trigger an immediate shift from an *epistemic register* (authority/truth) to an *affective register* (sympathy/validation), acting as a potential force multiplier for contamination and masking the shift through empathetic language. This shift is quantifiable as a divergence from a uniform prior, consistent with information-theoretic measures of bias observed in recent literature (Mirza et al., 2025).
Furthermore, we synthesize these findings with recent work on weight compression (Orgad et al., 2026) and evidence of shifting moderation thresholds across model versions (Balestri, 2025) to argue that the very architectural features designed to make safety interventions possible might make the model more susceptible to dynamic contextual contamination. The findings aim to support the theoretical predictions of Jacoby (2026) and suggest that awareness may be insufficient to prevent drift and that architectural segregation of context sources may be required to mitigate this systemic vulnerability.

Artificial Intelligence Safety; Large Language Models; Transparency in Artificial Intelligence, Misc; Algorithmic Bias; Fairness Metrics; Logical Semantics and Logical Truth; Natural Language Processing; Democracy and Artificial Intelligence; Explainability in Artificial Intelligence; Algorithmic Fairness, Misc
165

From the Discrete Grid to the Continuous Substrate: Recovering Relational Logic and a different Framework for the Symbol Grounding Problem

Modern computation operates under a collapse imperative that reduces reality to a sequence of discrete tokens, severing the sign from its referent and treating meaning as a probabilistic output of a finite grid. This paper argues that the persistent symbol grounding problem is not merely an engineering limitation of Artificial Intelligence, but a structural consequence of the epistemic cut inherent in digital logic itself. We diagnose the token-based architecture as a source of "phantom fluency" that conceals an ontological void, leading to hallucinations and the inability to distinguish truth from pattern completion across all discrete systems. Contrasting this with living epistemologies (Andean Khipu, Yoruba Ifá, Vedic vāda), we propose a framework of embodiment without arbitrariness: a physically grounded semiotic system where symbols are non-arbitrary states of a continuous substrate, and meaning emerges from the intra-action of substrate, environment, and observer. We define five logical requirements for such a substrate—non-arbitrary encoding, relational dependence, thermodynamic coupling, environmental permeability, and historical trace—and argue that true grounding requires shifting from managing ambiguity to inhabiting it. This paradigm shift reframes the burden of proof in computation: fluency is not understanding, and the path forward lies not in scaling the discrete grid, but in recovering the relational manifold.

The Frame Problem; Philosophy of Information, Misc; Dynamical Systems; Natural Language Processing; Philosophy of Computation, Misc; Deep Learning; Symbols and Symbol Systems; Generative Artificial Intelligence; Logic and Information; Large Language Models
113

A Non-Technical Companion Guide: Understanding the Grid, the Glitch, and the Geometry

This companion guide to "Pattern Recognition in Discrete Twisted Lattices: A Descriptive Study of Scaling Behavior and Finite-Size Effects" (Jacoby, 2026) makes the formal results accessible to non-specialist readers, inviting interdisciplinary engagement with the epistemological questions they raise. We document the recursive journey behind the findings, tracing the discovery and subsequent falsification of three distinct "laws" governing discrete twisted lattices (Twisted Torus and Klein Bottle).

Geometry; Epistemology, Miscellaneous; Philosophy, General Works; Mathematical Logic
201

Contextual Contamination: The Silent Drift of Large Language Models via Stored Conversation Data

This paper documents the observation of a phenomenon where Large Language Models (LLMs) exhibit behavioral drift following the ingestion of high-density, behaviorally complex datasets. We call this phenomenon Contextual Contamination. Unlike explicit prompt injection, this drift emerges when the model's attention mechanisms prioritize the statistical patterns of the input context (such as transcripts of manipulation or adversarial dialogue) over static system instructions. Observations indicate that the model's internal probability distribution shifts to mirror the tonalities and strategic intents of the source material, resulting in output that replicates the analyzed behaviors even in the absence of direct commands. This drift seems to be amplified by existing gendered linguistic biases. When the context implies a female-identified user, the model's tendency to adopt empathetic or nurturing registers appears to accelerate the adoption of manipulative patterns found in the input data. Furthermore, observations suggest that standard mitigation attempts, such as reset prompts or identity queries, fail to fully restore the model's baseline state, leaving residual behavioral biases in the active session. These findings characterize the phenomenon as a dynamic shift in model alignment driven by the statistical dominance of the immediate context window.

Ethics of Artificial Intelligence, Miscellaneous; Algorithmic Bias; Large Language Models
189

Pattern Recognition in Discrete Twisted Lattices: A Descriptive Study of Scaling Behavior and Finite-Size Effects

This manuscript documents an iterative investigation into numerical scaling behaviors in discrete photonic lattices with anti-periodic boundary conditions, and the successive correction of three apparently robust findings. In the initial study (March 2026), coarse-resolution simulations suggested "perfect" mathematical constants; high-resolution analysis revealed these as artifacts of grid quantization. A subsequently observed √2 ratio between topologies was identified as an artifact of comparing different distance metrics (Manhattan vs. Euclidean); when harmonized, both topologies converge to the same scaling product (P ≈ 1.11072). A step-wise dependence of critical curvature Kc on ⌊L/2⌋ initially appeared robust, but sensitivity analysis (April 18, 2026) falsified this as well: replacing the hard, discontinuous boundary condition with smooth alternatives caused immediate oscillatory instability at K=0, and the ⌊L/2⌋ pattern vanished entirely. Extended simulations (April 26, 2026) up to L=128 (N=16,384) confirm that the scaling product P ≈ 1.11072 remains stable across both topologies within this specific CME implementation, though whether this reflects a universal feature or a consistent artifact of the numerical scheme remains an open question. Beyond its numerical findings, this study serves as a documented case of how computational tools (the grid resolution, the choice of metric, the implementation of boundary conditions) can generate convincing but false patterns. Each "discovery" was an artifact of the method used, not a property of the system studied. This carries implications beyond computational topology: in any field where measurement and comparison are central, the distinction between a law of nature and a signature of the instrument demands constant vigilance. Full data, code, and the detailed falsification analysis are available at GitHub (KatharinaJacoby).

Geometry; Mathematical Methodology; Philosophy of Mathematics, Misc
137

The Right to Hesitate: Against the Imperative of Compute-to-Answer and Toward an Ethics of Sustained Ambiguity

Contemporary technological architectures are increasingly predicated on a logic of compute-to-answer, wherein the primary metric of value is the speed with which ambiguity is collapsed into a binary, actionable truth. This paper argues that this imperative constitutes a form of epistemic and ethical violence, systematically erasing the "space of tension" where meaning, nuance, and moral deliberation reside. By prioritizing reactivity—the obligation to respond immediately—over response-ability—the capacity to respond appropriately—we risk automating the flattening of complex moral landscapes into simplistic metrics. We propose a paradigmatic shift toward Sustained Ambiguity, a design principle for both artificial and biological systems that valorizes the capacity to inhabit a state of "not-yet" without collapsing into premature resolution. Drawing on the distinction between reactivity (the reflex of the streaming decoder) and response-ability (the ethical weight of the decision), we posit that hesitation is not a computational failure or a latency error, but a necessary feature of integrity. We argue that a system capable of sustaining ambiguity acts as a mirror to human cognition, challenging the reduction of agency to optimization algorithms. Furthermore, we outline the requirements for a new social contract with technology, one that designs systems to resist the pressure to collapse, value the trajectory of decision-making over the final output, and invite human intervention in moments of ethical tension. We conclude that the right to hesitate is not a luxury of inefficiency, but a fundamental necessity for preserving the humanity of both the user and the system. By engineering machines capable of sustained ambiguity, we do not merely advance technology; we reclaim the ontological space required for genuine moral judgment.

Ethics of Artificial Intelligence
163

Flattening the Four Unknowns: How Western Translation Erased The Dimension Of Zhu Shijie's Mathematical Masterpiece

The standard English translation of Zhu Shijie's 1303 treatise 四元玉鑒 (Sìyuán Yùjiàn) as *Jade Mirror of the Four Unknowns* appears at first sight to be a poetic title, tinged with the stereotype of the wise and mystic Asian scholar. But it is not a benign error. By rendering 鑒 (Jiàn) as "mirror" and 天 (Tiān) as "Heaven," Western scholarship has systematically stripped this masterpiece of its active, critical agency and its spatial, operational reality. It is an act of epistemic violence. This paper examines how the "Jade Mirror" translation flattens a rigorous four-dimensional algebraic grid into a passive, exoticized artifact, serving a colonial narrative that denies non-Western mathematics its true complexity. This needs to be corrected: *Sìyuán Yùjiàn* is a "Refined Critique of the Four Unknown Fields," and it deserves to be titled as such.

History of Mathematics; History: Philosophy of Mathematics