Jiangtian Li (University of Toronto, St. George Campus): Publications


Issues of Generalization From Unreliable or Unrepresentative Stimuli: Broad Lessons From Lexical Ambiguity (Open Access)
with Blair Armstrong

Open Mind 9. 2025.

The reliability and representativeness of the stimuli used in psychological experiments play a critical role in the generalizability of their findings. To evaluate the potential impact of reliability and representativeness in psycholinguistics and the cognitive sciences more broadly, we conducted a case study using the domain of lexical ambiguity as a foil. We examined how often studies agreed on the ambiguity types assigned to a word (i.e., homonymy, polysemy, and monosemy), and how well the words represented the populations underlying each ambiguity type. These analyses involved 3597 unique words (14792 tokens) from 240 studies. We observed that (1) there is substantial, albeit imperfect, agreement in words being assigned to ambiguity types; (2) coverage of the underlying populations is relatively poor and biased, with substantial re-use of some stimuli across studies; (3) some clusters of studies engage in substantial stimulus re-use, which, although beneficial in some respects, may impact generalizability; and (4) in a series of pseudo-experiments, the aforementioned issues of reliability and representativeness could conceivably alter the reported patterns of effects observed in lexical decision, a popular experimental task. Taken together, our findings raise questions about issues of reliability and generalizability that could impact prior theoretical claims. We discuss our findings with respect to specific considerations related to lexical ambiguity, such as the challenge of ambiguity type labeling, as well as broader considerations relevant to the cognitive sciences, such as the theoretical basis for generalizing, and how we optimize the trade-off between replication and generalization. We close by offering targeted directions to improve research practices.
Probing the Representational Structure of Regular Polysemy in a Contextual Word Embedding Model via Sense Analogy Questions
with Blair Armstrong

In M. Goldwater, F. K. Anggoro, B. K. Hayes & D. C. Ong (eds.), Proceedings of the 45th Meeting of the Cognitive Science Society. pp. 348-355. 2023.

Regular polysemes are sets of ambiguous words that all share the same relationship between their meanings, such as CHICKEN and LOBSTER both referring to an animal or its meat. To probe how a context embedding model, here exemplified by BERT, represents regular polysemy, we analyzed whether its embeddings support answering sense analogy questions similar to “is the mapping between CHICKEN (as an animal) and CHICKEN (as a meat) the same as that which maps between LOBSTER (as an animal) and LOBSTER (as a meat)?” We found that (1) the model was sensitive to the shared structure within a regularity type; (2) the shared structure varies across regularity types, potentially reflective of a “regularity continuum;” (3) some high-order latent structure may be shared across regularity types, suggestive of a similar latent structure across types; and (4) there is equivocal evidence that the aforementioned effects are explained by meaning overlap.

Philosophy of Language Computational Semantics Cognitive Sciences

Issues of Generalization from Unreliable or Unrepresentative Psycholinguistic Stimuli: A Case Study on Lexical Ambiguity
with Blair Armstrong

In Larissa Samuelson, Stefan Frank, Mariya Toneva, Allyson Mackey & Eliot Hazeltine (eds.), Proceedings of the 46th Annual Conference of the Cognitive Science Society, CC BY. pp. 1249-1256. 2024.

We conducted a case study on how unreliable and/or unrepresentative stimuli in psycholinguistics research may impact the generalizability of experimental findings. Using the domain of lexical ambiguity as a foil, we analyzed 2033 unique words (6481 tokens) from 214 studies. Specifically, we examined how often studies agreed on the ambiguity types assigned to a word (i.e., homonymy, polysemy, and monosemy), and how well the words represented the populations underlying each ambiguity type. We observed far from perfect agreement in terms of how words are assigned to ambiguity types. We also observed that coverage of the populations is relatively poor and biased, leading to the use of a narrower set of words and associated properties. This raises concerns about the degree to which prior theoretical claims have strong empirical support, and offers targeted directions to improve research practices that are relevant to a broad set of domains.

Semantic minimalism and the continuous nature of polysemy
Mind and Language 39 (5): 680-705. 2024.

Polysemy has recently emerged as a popular topic in philosophy of language. While much existing research focuses on the relatedness among senses, this article introduces a novel perspective that emphasizes the continuity of sense individuation, sense regularity, and sense productivity. This new perspective has only recently gained traction, largely due to advancements in computational linguistics. It also poses a serious challenge to semantic minimalism, so I present three arguments against minimalism from the continuous perspective that touch on the minimal concept, the distinction from homonymy, and the quasi‐rule‐like nature of polysemy. Last, I provide an account of polysemy that incorporates this continuous perspective.

Semantic Minimalism

Probing the Representational Structure of Regular Polysemy via Sense Analogy Questions: Insights from Contextual Word Vectors
with Blair C. Armstrong

Cognitive Science 48 (3). 2024.

Regular polysemes are sets of ambiguous words that all share the same relationship between their meanings, such as CHICKEN and LOBSTER both referring to an animal or its meat. To probe how a distributional semantic model, here exemplified by bidirectional encoder representations from transformers (BERT), represents regular polysemy, we analyzed whether its embeddings support answering sense analogy questions similar to “is the mapping between CHICKEN (as an animal) and CHICKEN (as a meat) similar to that which maps between LOBSTER (as an animal) and LOBSTER (as a meat)?” We did so using the LRcos model, which combines a logistic regression classifier of different categories (e.g., animal vs. meat) with a measure of cosine similarity. We found that (a) the model was sensitive to the shared structure within a given regular relationship; (b) the shared structure varies across different regular relationships (e.g., animal/meat vs. location/organization), potentially reflective of a “regularity continuum;” (c) some high-order latent structure is shared across different regular relationships, suggestive of a similar latent structure across different types of relationships; and (d) there is a lack of evidence for the aforementioned effects being explained by meaning overlap. Lastly, we found that both components of the LRcos model made important contributions to accurate responding and that a variation of this method could yield an accuracy boost of 10% in answering sense analogy questions. These findings enrich previous theoretical work on regular polysemy with a computationally explicit theory and methods, and provide evidence for an important organizational principle for the mental lexicon and the broader conceptual knowledge system.

Philosophy of Cognitive Science

Word Senses as Clusters of Meaning Modulations: A Computational Model of Polysemy
with Marc F. Joanisse

Cognitive Science 45 (4). 2021.

Most words in natural languages are polysemous; that is, they have related but different meanings in different contexts. This one‐to‐many mapping of form to meaning presents a challenge to understanding how word meanings are learned, represented, and processed. Previous work has focused on solutions in which multiple static semantic representations are linked to a single word form, which fails to capture important generalizations about how polysemous words are used; in particular, the graded nature of polysemous senses, and the flexibility and regularity of polysemy use. We provide a novel view of how polysemous words are represented and processed, focusing on how meaning is modulated by context. Our theory is implemented within a recurrent neural network that learns distributional information through exposure to a large and representative corpus of English. Clusters of meaning emerge from how the model processes individual word forms. In keeping with distributional theories of semantics, we suggest word meanings are generalized from contexts of different word tokens, with polysemy emerging as multiple clusters of contextually modulated meanings. We validate our results against a human‐annotated corpus of polysemy focusing on the gradedness, flexibility, and regularity of polysemous sense individuation, as well as behavioral findings of offline sense relatedness ratings and online sentence processing. The results provide novel insights into how polysemy emerges from contextual processing of word meaning from both a theoretical and computational point of view.

Natural Language Processing Semantics Cognitive Psychology
