Prompting is now a dominant method for evaluating the linguistic knowledge of large language models (LLMs). While other methods directly read out models’ probability distributions over strings, prompting requires models to access this internal information by processing linguistic input, thereby implicitly testing a new type of emergent ability: metalinguistic judgment. In this study, we compare metalinguistic prompting and direct probability measurements as ways of assessing models’ knowledge of English. Broadly, we find that LLMs’ metalinguistic judgments are inferior to quantities directly derived from representations. Furthermore, consistency gets worse as the prompt diverges from direct measurements of next-word probabilities. Our findings suggest that negative results relying on metalinguistic prompts cannot be taken as conclusive evidence that an LLM lacks a particular linguistic competence. Our results also highlight the value lost with the move to closed APIs, where access to probability distributions is limited.
@manuscript{hu_prompt-based_2023,author={Hu, Jennifer and Levy, Roger},title={Prompt-based methods may underestimate large language models' linguistic generalizations},url={https://lingbuzz.net/lingbuzz/007313},year={2023},code={https://github.com/jennhu/metalinguistic-prompting}}
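For readers who want to see the contrast concretely, here is a minimal sketch of the two evaluation methods, using the Hugging Face transformers API with gpt2 as a stand-in model; the minimal pair and prompt wording are illustrative, not the paper’s materials.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# gpt2 is a stand-in model; the sentences below are illustrative examples.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_logprob(sentence: str) -> float:
    """Direct measurement: total log probability the LM assigns to a string."""
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)
    # out.loss is the mean negative log-likelihood per predicted token; undo the mean.
    return -out.loss.item() * (ids.shape[1] - 1)

good = "The keys to the cabinet are on the table."
bad = "The keys to the cabinet is on the table."

# Direct method: read the verdict off the probability distribution.
direct_prefers_good = sentence_logprob(good) > sentence_logprob(bad)

# Metalinguistic method: route the same question through linguistic input.
prompt = f"Which sentence is grammatically correct?\n1: {good}\n2: {bad}\nAnswer:"
# (Generation and answer extraction omitted; the paper's finding is that this
# indirect route is less consistent than the direct comparison above.)
```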
2022
Pragmatics in Grounded Language Learning: Phenomena, Tasks, and Modeling Approaches
Daniel Fried,
Nicholas Tomlin,
Jennifer Hu,
Roma Patel,
and Aida Nematzadeh
People rely heavily on context to enrich meaning beyond what is literally said, enabling concise but effective communication. To interact successfully and naturally with people, user-facing artificial intelligence systems will require similar skills in pragmatics: relying on various types of context – from shared linguistic goals and conventions, to the visual and embodied world – to use language effectively. We survey existing grounded settings and pragmatic modeling approaches and analyze how the task goals, environmental contexts, and communicative affordances in each work enrich linguistic meaning. We present recommendations for future grounded task design to naturally elicit pragmatic phenomena, and suggest directions that focus on a broader range of communicative contexts and affordances.
@manuscript{fried_pragmatics_2022,title={Pragmatics in {Grounded} {Language} {Learning}: {Phenomena}, {Tasks}, and {Modeling} {Approaches}},url={https://arxiv.org/abs/2211.08371},author={Fried, Daniel and Tomlin, Nicholas and Hu, Jennifer and Patel, Roma and Nematzadeh, Aida},year={2022}}
2020
A Rate–Distortion view of human pragmatic reasoning
Noga Zaslavsky,
Jennifer Hu,
and Roger Levy
What computational principles underlie human pragmatic reasoning? A prominent approach to pragmatics is the Rational Speech Act (RSA) framework, which formulates pragmatic reasoning as probabilistic speakers and listeners recursively reasoning about each other. While RSA enjoys broad empirical support, it is not yet clear whether the dynamics of such recursive reasoning may be governed by a general optimization principle. Here, we present a novel analysis of the RSA framework that addresses this question. First, we show that RSA recursion implements an alternating maximization for optimizing a tradeoff between expected utility and communicative effort. On that basis, we study the dynamics of RSA recursion and disconfirm the conjecture that expected utility is guaranteed to improve with recursion depth. Second, we show that RSA can be grounded in Rate-Distortion theory, while maintaining a similar ability to account for human behavior and avoiding a bias of RSA toward random utterance production. This work furthers the mathematical understanding of RSA models, and suggests that general information-theoretic principles may give rise to human pragmatic reasoning.
@manuscript{zaslavsky_rate--distortion_2020,author={Zaslavsky, Noga and Hu, Jennifer and Levy, Roger},title={A Rate--Distortion view of human pragmatic reasoning},year={2020},url={https://arxiv.org/abs/2005.06641}}
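The recursion at the heart of this analysis fits in a few lines. Below is a toy sketch under simplifying assumptions (a two-word scalar lexicon, uniform prior, zero utterance costs, so the effort term drops out); it is not the paper’s experimental setup.

```python
import numpy as np

LEXICON = np.array([[1., 1.],   # "some" is true of {some-but-not-all, all}
                    [0., 1.]])  # "all"  is true of {all} only
PRIOR = np.array([0.5, 0.5])    # P(meaning); utterance costs are set to zero

def normalize(a, axis):
    s = a.sum(axis=axis, keepdims=True)
    return np.divide(a, s, out=np.zeros_like(a), where=s > 0)

def rsa(alpha=1.0, depth=1):
    # Literal listener: L0(m | u) proportional to [[u]](m) * P(m)
    listener = normalize(LEXICON * PRIOR, axis=1)
    for _ in range(depth):
        with np.errstate(divide="ignore"):
            # Speaker step: S(u | m) proportional to exp(alpha * log L(m | u))
            speaker = normalize(np.exp(alpha * np.log(listener)), axis=0)
        # Listener step: L(m | u) proportional to S(u | m) * P(m)
        listener = normalize(speaker * PRIOR, axis=1)
    return speaker, listener

_, L1 = rsa(depth=1)
print(L1[0])  # P(meaning | "some") -> [0.75, 0.25]: the scalar implicature
```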
Journal articles
2023
Expectations over unspoken alternatives predict pragmatic inferences
Jennifer Hu,
Roger Levy,
Judith Degen,
and Sebastian Schuster
Transactions of the Association for Computational Linguistics
@article{hu_expectations_2023,author={Hu, Jennifer and Levy, Roger and Degen, Judith and Schuster, Sebastian},title={Expectations over unspoken alternatives predict pragmatic inferences},journal={Transactions of the Association for Computational Linguistics},url={https://arxiv.org/abs/2304.04758},year={2023},code={https://github.com/jennhu/expectations-over-alternatives}}
2022
Precision fMRI reveals that the language-selective network supports both phrase-structure building and lexical access during language production
Jennifer Hu,
Hannah Small,
Hope Kean,
Atsushi Takahashi,
Leo Zekelman,
Daniel Kleinman,
Elizabeth Ryan,
Alfonso Nieto-Castañón,
Victor Ferreira,
and Evelina Fedorenko
Cerebral Cortex
A fronto-temporal brain network has long been implicated in language comprehension. However, this network’s role in language production remains debated. In particular, it remains unclear whether all or only some language regions contribute to production, and which aspects of production these regions support. Across three fMRI experiments that rely on robust individual-subject analyses, we characterize the language network’s response to high-level production demands. We report three novel results. First, sentence production, spoken or typed, elicits a strong response throughout the language network. Second, the language network responds to both phrase-structure building and lexical access demands, although the response to phrase-structure building is stronger and more spatially extensive, present in every language region. Finally, contra some proposals, we find no evidence of brain regions—within or outside the language network—that selectively support phrase-structure building in production relative to comprehension. Instead, all language regions respond more strongly during production than comprehension, suggesting that production incurs a greater cost for the language network. Together, these results align with the idea that language comprehension and production draw on the same knowledge representations, which are stored in a distributed manner within the language-selective network and are used to both interpret and generate linguistic utterances.
@article{hu_language_2022,author={Hu, Jennifer and Small, Hannah and Kean, Hope and Takahashi, Atsushi and Zekelman, Leo and Kleinman, Daniel and Ryan, Elizabeth and Nieto-Castañón, Alfonso and Ferreira, Victor and Fedorenko, Evelina},title={Precision fMRI reveals that the language-selective network supports both phrase-structure building and lexical access during language production},year={2022},journal={Cerebral Cortex},url={https://academic.oup.com/cercor/advance-article-abstract/doi/10.1093/cercor/bhac350/6706753},code={https://github.com/jennhu/LanguageProduction/}}
Book chapters
2023
Learning syntactic structures from string input
Ethan Wilcox,
Jon Gauthier,
Jennifer Hu,
Peng Qian,
and Roger Levy
Algebraic Structures in Natural Language
@incollection{wilcox_learning_2023,title={Learning syntactic structures from string input},booktitle={Algebraic {Structures} in {Natural} {Language}},publisher={Taylor \& Francis},author={Wilcox, Ethan and Gauthier, Jon and Hu, Jennifer and Qian, Peng and Levy, Roger},editor={Lappin, Shalom and Bernardy, Jean-Philippe},year={2023},note={In press}}
Conference proceedings
2023
An AI Dungeon Master’s Guide: Learning to Converse and Guide with Intents and Theory-of-Mind in Dungeons and Dragons
Pei Zhou,
Andrew Zhu,
Jennifer Hu,
Jay Pujara,
Xiang Ren,
Chris Callison-Burch,
Yejin Choi,
and Prithviraj Ammanabrolu
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics
We propose a novel task, G4C (Goal-driven Guidance Generation in Grounded Communication), for studying goal-driven and grounded natural language interactions. Specifically, we choose Dungeons and Dragons (D&D) – a role-playing game consisting of multiple player characters and a Dungeon Master (DM) who collaborate to achieve a set of goals that are beneficial to the players – as a testbed for this task. Here, each of the player characters is a student, with their own personas and abilities, and the DM is the teacher: an arbitrator of the rules of the world, responsible for assisting and guiding the students towards a global goal. We propose a theory-of-mind-inspired methodology for training such a DM with reinforcement learning (RL), where a DM: (1) learns to predict how the players will react to its utterances using a dataset of D&D dialogue transcripts; and (2) uses this prediction as a reward function providing feedback on how effective these utterances are at guiding the players towards a goal. Human and automated evaluations show that a DM trained with RL to generate guidance by incorporating a theory-of-mind of the players significantly improves the players’ ability to achieve goals grounded in their shared world.
@inproceedings{zhou_ai_2023,title={An {AI} {Dungeon} {Master}'s {Guide}: {Learning} to {Converse} and {Guide} with {Intents} and {Theory}-of-{Mind} in {Dungeons} and {Dragons}},url={https://arxiv.org/abs/2212.10060},booktitle={Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics},author={Zhou, Pei and Zhu, Andrew and Hu, Jennifer and Pujara, Jay and Ren, Xiang and Callison-Burch, Chris and Choi, Yejin and Ammanabrolu, Prithviraj},year={2023},note={To appear}}
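The reward idea can be sketched abstractly as follows; the tom_reward function and the player_model interface are hypothetical stand-ins, not the authors’ implementation.

```python
# player_model.predict_reaction is a hypothetical interface, not released code.
def tom_reward(history: str, dm_utterance: str,
               intended_action: str, player_model) -> float:
    """Score a DM utterance by whether a learned model of the players predicts
    that it will elicit the DM's intended action (the theory-of-mind step)."""
    predicted = player_model.predict_reaction(history, dm_utterance)
    # Sparse RL reward: 1 when the anticipated reaction fulfills the intent.
    return 1.0 if predicted == intended_action else 0.0
```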
2023
A fine-grained comparison of pragmatic language understanding in humans and language models
Jennifer Hu,
Sammy Floyd,
Olessia Jouravlev,
Evelina Fedorenko,
and Edward Gibson
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics
Pragmatics is an essential part of communication, but it remains unclear what mechanisms underlie human pragmatic communication and whether NLP systems capture pragmatic language understanding. To investigate both these questions, we perform a fine-grained comparison of language models and humans on seven pragmatic phenomena, using zero-shot prompting on an expert-curated set of English materials. We ask whether models (1) select pragmatic interpretations of speaker utterances, (2) exhibit error patterns similar to humans’, and (3) use similar linguistic cues as humans to solve the tasks. We find that the largest models achieve high accuracy and match human error patterns: within incorrect responses, models favor the literal interpretation of an utterance over heuristic-based distractors. We also find evidence that models and humans are sensitive to similar linguistic cues. Our results suggest that even paradigmatic pragmatic phenomena may be solved without explicit representations of other agents’ mental states, and that artificial models can be used to gain mechanistic insights into human pragmatic processing.
@inproceedings{hu_fine-grained_2023,title={A fine-grained comparison of pragmatic language understanding in humans and language models},url={https://arxiv.org/abs/2212.06801},code={https://github.com/jennhu/lm-pragmatics},author={Hu, Jennifer and Floyd, Sammy and Jouravlev, Olessia and Fedorenko, Evelina and Gibson, Edward},year={2023},booktitle={Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics},note={To appear}}
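As a rough sketch of the zero-shot setup, each answer option can be scored by the log probability the language model assigns to it in context; the scoring interface below is an assumption, not the released evaluation code.

```python
# `logprob` is any function returning an LM's log probability for a string
# (e.g., a sentence_logprob helper); this shows the setup, not the materials.
def choose(context: str, options: list[str], logprob) -> int:
    """Zero-shot multiple choice: pick the option the LM finds most probable."""
    scores = [logprob(context + " " + option) for option in options]
    return max(range(len(options)), key=scores.__getitem__)

# Error-pattern analysis then conditions on incorrect responses, asking whether
# the model preferred the literal interpretation or a heuristic-based distractor.
```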
2022
Teasing apart models of pragmatics using optimal reference game design
Irene Zhou,
Jennifer Hu,
Roger Levy,
and Noga Zaslavsky
How do humans produce and comprehend language in pragmatic ways? A variety of models of pragmatic inferences have been proposed, and these models are typically evaluated on their ability to account for human inferences in reference game experiments. However, these experiments are not tailored to target theoretical differences between models or clearly tease apart model predictions. We propose an optimal experiment design approach to systematically construct reference games that can optimally differentiate between models of human pragmatic reasoning. We demonstrate this approach and apply it to four models that have been debated in the literature: Grammar-based, Iterated Best Response (IBR), Rational Speech Act (RSA), and a recent variant of RSA grounded in rate-distortion theory (RD-RSA). Using these optimal reference game experiments, we find empirical evidence favoring iterated rationality models over the grammar-based model, as well as support for the relevance of rate-distortion theory to human pragmatic inferences. These results suggest that our optimal reference game design framework may help adjudicate between computational theories of pragmatic reasoning.
@inproceedings{zhou_teasing_2022,author={Zhou, Irene and Hu, Jennifer and Levy, Roger and Zaslavsky, Noga},title={Teasing apart models of pragmatics using optimal reference game design},booktitle={Proceedings of the Cognitive Science Society},url={https://escholarship.org/uc/item/7mr9809j},year={2022}}
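The design objective can be sketched as model-disagreement maximization; using Jensen–Shannon divergence as the diagnosticity criterion below is an illustrative choice, not necessarily the paper’s exact objective.

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)
    with np.errstate(divide="ignore", invalid="ignore"):
        kl = lambda a: float(np.sum(np.where(a > 0, a * np.log(a / m), 0.0)))
        return 0.5 * kl(p) + 0.5 * kl(q)

def most_diagnostic_game(games, model_a, model_b):
    # Each model maps a game (context + utterance) to P(referent | utterance).
    return max(games, key=lambda g: js_divergence(model_a(g), model_b(g)))
```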
2021
Controlled Evaluation of Grammatical Knowledge in Mandarin Chinese Language Models
Yiwen Wang,
Jennifer Hu,
Roger Levy,
and Peng Qian
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Prior work has shown that structural supervision helps English language models learn generalizations about syntactic phenomena such as subject-verb agreement. However, it remains unclear if such an inductive bias would also improve language models’ ability to learn grammatical dependencies in typologically different languages. Here we investigate this question in Mandarin Chinese, which has a logographic, largely syllable-based writing system, different word order, and sparser morphology than English. We train LSTMs, Recurrent Neural Network Grammars, Transformer language models, and Transformer-parameterized generative parsing models on two Mandarin Chinese datasets of different sizes. We evaluate the models’ ability to learn different aspects of Mandarin grammar that assess syntactic and semantic relationships. We find suggestive evidence that structural supervision helps models represent syntactic state across intervening content and improves performance in low-data settings, indicating that the benefits of hierarchical inductive biases in acquiring dependency relationships may extend beyond English.
@inproceedings{wang_controlled_2021,author={Wang, Yiwen and Hu, Jennifer and Levy, Roger and Qian, Peng},title={Controlled Evaluation of Grammatical Knowledge in Mandarin Chinese Language Models},booktitle={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing},year={2021},url={https://arxiv.org/abs/2109.11058}}
2021
Competition from novel features drives scalar inferences in reference games
Jennifer Hu,
Noga Zaslavsky,
and Roger Levy
Proceedings of the Cognitive Science Society
Scalar implicatures, one of the signatures of pragmatic reasoning, are believed to arise from competing alternative utterances, which the listener knows that the speaker could have used to express a strengthened meaning. But do scalar implicatures also arise in the presence of nonce objects, for which no alternative name is known? We conduct a series of experiments assessing the degree of scalar strengthening driven by familiar and nonce objects. We find that nonce objects can drive scalar implicatures as strongly as familiar objects in simple reference games. Our experiments also reveal an asymmetry in the relative strengths of familiar- and nonce-driven inferences: relative to the prior, participants preferentially interpret the name of a shared feature as referring to an object with an additional nonce feature over an object with an additional familiar feature, suggesting that familiar alternatives exert greater scalar pressure than nonce alternatives. We also present exploratory model simulations suggesting that our results may be explained by rationally reasoning about a high-cost description of the novel object. Our findings support the idea that novel lexical entries may be generated from one-shot encounters and spontaneously used in pragmatic inference.
@inproceedings{hu_competition_2021,author={Hu, Jennifer and Zaslavsky, Noga and Levy, Roger},title={Competition from novel features drives scalar inferences in reference games},booktitle={Proceedings of the Cognitive Science Society},year={2021},code={https://github.com/jennhu/nonce-SI},url={https://escholarship.org/uc/item/8jx5h8sn}}
2020
On the predictive power of neural language models for human real-time comprehension behavior
Ethan Wilcox,
Jon Gauthier,
Jennifer Hu,
Peng Qian,
and Roger Levy
Proceedings of the Cognitive Science Society
@inproceedings{wilcox_predictive_2020,author={Wilcox, Ethan and Gauthier, Jon and Hu, Jennifer and Qian, Peng and Levy, Roger},title={On the predictive power of neural language models for human real-time comprehension behavior},booktitle={Proceedings of the Cognitive Science Society},year={2020},url={https://arxiv.org/abs/2006.01912},code={https://github.com/wilcoxeg/neural-networks-read-times}}
2020
A Systematic Assessment of Syntactic Generalization in Neural Language Models
Jennifer Hu,
Jon Gauthier,
Peng Qian,
Ethan Wilcox,
and Roger Levy
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
While state-of-the-art neural network models continue to achieve lower perplexity scores on language modeling benchmarks, it remains unknown whether optimizing for broad-coverage predictive performance leads to human-like syntactic knowledge. Furthermore, existing work has not provided a clear picture about the model properties required to produce proper syntactic generalizations. We present a systematic evaluation of the syntactic knowledge of neural language models, testing 20 combinations of model types and data sizes on a set of 34 English-language syntactic test suites. We find substantial differences in syntactic generalization performance by model architecture, with sequential models underperforming other architectures. Factorially manipulating model architecture and training dataset size (1M-40M words), we find that variability in syntactic generalization performance is substantially greater by architecture than by dataset size for the corpora tested in our experiments. Our results also reveal a dissociation between perplexity and syntactic generalization performance.
@inproceedings{hu-etal-2020-systematic,title={A Systematic Assessment of Syntactic Generalization in Neural Language Models},author={Hu, Jennifer and Gauthier, Jon and Qian, Peng and Wilcox, Ethan and Levy, Roger},booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},month=jul,year={2020},address={Online},publisher={Association for Computational Linguistics},url={https://www.aclweb.org/anthology/2020.acl-main.158},pages={1725--1744},code={https://github.com/cpllab/syntactic-generalization}}
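A typical test-suite criterion compares surprisals across minimally different conditions; the helper below is an assumed interface to some language model, not the paper’s released code.

```python
# `surprisal` is an assumed helper returning -log P(region | context) under
# some language model; the pass/fail logic is the point of the sketch.
def passes_item(surprisal, context: str, good_region: str, bad_region: str) -> bool:
    """An item is passed when the critical region is more surprising
    (less probable) in the ungrammatical condition."""
    return surprisal(bad_region, context) > surprisal(good_region, context)

# Example (subject-verb agreement across an intervening PP):
# passes_item(s, "The keys to the cabinet", "are here", "is here")
```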
2020
SyntaxGym: An Online Platform for Targeted Evaluation of Language Models
Jon Gauthier,
Jennifer Hu,
Ethan Wilcox,
Peng Qian,
and Roger Levy
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations
Targeted syntactic evaluations have yielded insights into the generalizations learned by neural network language models. However, this line of research requires an uncommon confluence of skills: both the theoretical knowledge needed to design controlled psycholinguistic experiments, and the technical proficiency needed to train and deploy large-scale language models. We present SyntaxGym, an online platform designed to make targeted evaluations accessible to both experts in NLP and linguistics, reproducible across computing environments, and standardized following the norms of psycholinguistic experimental design. This paper releases two tools of independent value for the computational linguistics community: 1. A website, syntaxgym.org, which centralizes the process of targeted syntactic evaluation and provides easy tools for analysis and visualization; 2. Two command-line tools, "syntaxgym" and "lm-zoo", which allow any user to reproduce targeted syntactic evaluations and general language model inference on their own machine.
@inproceedings{gauthier-etal-2020-syntaxgym,title={{S}yntax{G}ym: An Online Platform for Targeted Evaluation of Language Models},author={Gauthier, Jon and Hu, Jennifer and Wilcox, Ethan and Qian, Peng and Levy, Roger},booktitle={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations},month=jul,year={2020},address={Online},publisher={Association for Computational Linguistics},url={https://www.aclweb.org/anthology/2020.acl-demos.10},pages={70--76},website={http://syntaxgym.org/}}
2020
A closer look at the performance of neural language models on reflexive anaphor licensing
Jennifer Hu,
Sherry Yong Chen,
and Roger Levy
Proceedings of the Society for Computation in Linguistics
@inproceedings{hu_closer_2020,author={Hu, Jennifer and Chen, Sherry Yong and Levy, Roger},title={A closer look at the performance of neural language models on reflexive anaphor licensing},booktitle={Proceedings of the Society for Computation in Linguistics},volume={3},pages={382-392},year={2020},url={https://scholarworks.umass.edu/scil/vol3/iss1/37/},code={https://github.com/jennhu/reflexive-anaphor-licensing}}
2019
Separating object resonance and room reverberation in impact sounds
Jennifer Hu,
James Traer,
and Josh H. McDermott
Proceedings of the Cognitive Science Society
Everyday hearing requires inferring the causal factors that produce a sound, as when we separate the acoustic effects of the environment (reverberation) from those of sound sources. Here we consider perceptual inferences from impact sounds, in which the resonance of a struck object provides cues to its material, but via acoustic effects that might be nontrivial to disentangle from reverberation. We investigated whether and how humans separate the effects of object resonance and reverberation in a material classification task. For comparison, we implemented a Bayesian observer that inferred material from a generative model of object sounds without reverberation. Humans were robust to reverberation, whereas the model was not. However, human robustness was specific to reverberation consistent with the statistics of natural environments. The results suggest that humans use internal models of room and object acoustics to determine their respective contributions to sound, providing an example of causal inference in audition.
@inproceedings{hu_separating_2019,author={Hu, Jennifer and Traer, James and McDermott, Josh H.},title={Separating object resonance and room reverberation in impact sounds},booktitle={Proceedings of the Cognitive Science Society},year={2019}}
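As a toy illustration of the Bayesian-observer comparison, the sketch below infers material from a single resonance cue with made-up Gaussian likelihoods; the paper’s generative model of impact sounds is far richer.

```python
import numpy as np

MATERIALS = ("wood", "metal")
DECAY_MEAN = {"wood": 40.0, "metal": 5.0}  # hypothetical decay rates (1/s)
DECAY_SD = {"wood": 10.0, "metal": 2.0}

def material_posterior(observed_decay: float) -> dict:
    # P(material | decay) proportional to P(decay | material) * P(material),
    # with a uniform prior over materials.
    like = np.array([np.exp(-0.5 * ((observed_decay - DECAY_MEAN[m]) / DECAY_SD[m]) ** 2)
                     / DECAY_SD[m] for m in MATERIALS])
    return dict(zip(MATERIALS, like / like.sum()))

# Reverberation adds a slowly decaying tail; an observer with no room model
# folds that energy into its object estimate, mirroring the model's failure.
print(material_posterior(8.0))
```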
2018
Generating Bilingual Pragmatic Color References
Will Monroe,
Jennifer Hu,
Andrew Jong,
and Christopher Potts
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
Contextual influences on language often exhibit substantial cross-lingual regularities; for example, we are more verbose in situations that require finer distinctions. However, these regularities are sometimes obscured by semantic and syntactic differences. Using a newly collected dataset of color reference games in Mandarin Chinese (which we release to the public), we confirm that a variety of constructions display the same sensitivity to contextual difficulty in Chinese and English. We then show that a neural speaker agent trained on bilingual data with a simple multitask learning approach displays more human-like patterns of context dependence and is more pragmatically informative than its monolingual Chinese counterpart. Moreover, this is not at the expense of language-specific semantic understanding: the resulting speaker model learns the different basic color term systems of English and Chinese (with noteworthy cross-lingual influences), and it can identify synonyms between the two languages using vector analogy operations on its output layer, despite having no exposure to parallel data.
@inproceedings{monroe-etal-2018-generating,title={Generating Bilingual Pragmatic Color References},author={Monroe, Will and Hu, Jennifer and Jong, Andrew and Potts, Christopher},booktitle={Proceedings of the 2018 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)},month=jun,year={2018},address={New Orleans, Louisiana},publisher={Association for Computational Linguistics},url={https://aclanthology.org/N18-1196},doi={10.18653/v1/N18-1196},pages={2155--2165},code={https://github.com/futurulus/colors-in-context},data={https://cocolab.stanford.edu/datasets/colors.html}}
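The vector-analogy probe mentioned at the end can be sketched on a toy embedding table; the random vectors below stand in for the trained speaker’s output layer.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["red", "blue", "红", "蓝"]
E = {w: rng.normal(size=8) for w in vocab}  # random stand-ins for output-layer vectors

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Analogy: red : 红 :: blue : ?  -- query vector is blue - red + 红
query = E["blue"] - E["red"] + E["红"]
best = max((w for w in vocab if w not in ("blue", "red", "红")),
           key=lambda w: cosine(query, E[w]))
print(best)  # with trained embeddings, this should recover "蓝"
```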
Workshop papers
2022
Predicting scalar diversity with context-driven uncertainty over alternatives
Jennifer Hu,
Roger Levy,
and Sebastian Schuster
ACL Workshop on Cognitive Modeling and Computational Linguistics
Scalar implicature (SI) arises when a speaker uses an expression (e.g., "some") that is semantically compatible with a logically stronger alternative on the same scale (e.g., "all"), leading the listener to infer that the speaker did not intend to convey the stronger meaning. Prior work has demonstrated that SI rates are highly variable across scales, raising the question of what factors determine the SI strength for a particular scale. Here, we test the hypothesis that SI rates depend on the listener’s confidence in the underlying scale, which we operationalize as uncertainty over the distribution of possible alternatives conditioned on the context. We use a T5 model fine-tuned on a text infilling task to estimate the distribution over context-conditioned alternatives. We measure uncertainty both over the sampled alternatives themselves and over latent clusters among alternatives in sentence embedding space. We find that both scale uncertainty measures predict human SI rates, suggesting that SI depends on listeners’ context-driven uncertainty over alternatives.
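Below is a sketch of the alternative-sampling step, with off-the-shelf t5-small standing in for the fine-tuned infilling model; only the entropy-over-samples measure is shown, and the example sentence is illustrative.

```python
import math
from collections import Counter
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-small")        # stand-in, not fine-tuned
model = T5ForConditionalGeneration.from_pretrained("t5-small")

def alternative_entropy(sentence_with_blank: str, k: int = 50) -> float:
    """Sample k infills for the scalar-term slot and return the empirical
    entropy over the sampled alternatives."""
    ids = tok(sentence_with_blank, return_tensors="pt").input_ids
    outs = model.generate(ids, do_sample=True, num_return_sequences=k,
                          max_new_tokens=5)
    counts = Counter(tok.decode(o, skip_special_tokens=True) for o in outs)
    return -sum((n / k) * math.log2(n / k) for n in counts.values())

# e.g., alternative_entropy("The food was <extra_id_0>, but not amazing.")
```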
2021
Scalable pragmatic communication via self-supervision
Jennifer Hu,
Roger Levy,
and Noga Zaslavsky
ICML Workshop on Self-Supervised Learning for Reasoning and Perception
Models of context-sensitive communication often use the Rational Speech Act framework (RSA; Frank & Goodman, 2012), which models listeners and speakers as engaging in a cooperative reasoning process. However, the standard RSA formulation can only be applied to small domains, and large-scale applications have relied on imitating human behavior. Here, we propose a new approach to scalable pragmatics, building upon recent theoretical results (Zaslavsky et al., 2020) that characterize pragmatic reasoning in terms of general information-theoretic principles. Specifically, we propose an architecture and learning process in which agents acquire pragmatic policies via self-supervision instead of imitating human data. This work suggests a principled approach for equipping artificial agents with pragmatic skills via self-supervision, grounded both in pragmatic theory and in information theory.
Abstracts
2022
Facilitative Effect Induced by Classifier-Noun Mismatch in Mandarin Chinese
Yiwen Wang,
Jennifer Hu,
Roger Levy,
and Peng Qian
The 35th Annual Conference on Human Sentence Processing
2022
Predicting scalar diversity with context-driven expectations
Jennifer Hu,
Roger Levy,
and Sebastian Schuster
Proceedings of the Experimental Pragmatics Conference (XPRAG)
2021
A Rate–Distortion view of human pragmatic reasoning
Noga Zaslavsky,
Jennifer Hu,
and Roger Levy
Proceedings of the Society for Computation in Linguistics
2021
Empirical support for a Rate–Distortion account of pragmatic reasoning
Irene Zhou,
Jennifer Hu,
Roger Levy,
and Noga Zaslavsky
Proceedings of the Cognitive Science Society
2020
Distributed and overlapping neural mechanisms for lexical access and syntactic encoding during language production
Jennifer Hu,
Hannah Small,
Hope Kean,
Atsushi Takahashi,
Leo Zekelman,
Daniel Kleinman,
Elizabeth Ryan,
Victor Ferreira,
and Evelina Fedorenko
Proceedings of the Society for the Neurobiology of Language
2020
Benchmarking neural networks as models of human language processing
Ethan Wilcox,
Jon Gauthier,
Jennifer Hu,
Peng Qian,
and Roger Levy
Proceedings of the 26th Architectures and Mechanisms for Language Processing Conference
2020
Evaluating the effect of model inductive bias and training data in predicting human reading times
Ethan Wilcox,
Jon Gauthier,
Peng Qian,
Jennifer Hu,
and Roger Levy
Proceedings of the 33rd Annual CUNY Human Sentence Processing Conference
2020
Emergence of pragmatic reasoning from least-effort optimization
Noga Zaslavsky,
Jennifer Hu,
and Roger Levy
Proceedings of the Evolution of Language International Conference
2018
A graph-theoretic approach to comparing typologies in Parallel OT and Harmonic Serialism
Jennifer Hu
Proceedings of the 92nd Annual Meeting of the Linguistic Society of America
Posters
2020
Evaluating the effect of model inductive bias and training data in predicting human reading times
Ethan Wilcox,
Jennifer Hu,
Jon Gauthier,
and Roger Levy
MIT-IBM Watson AI Lab Poster Session
2019
SyntaxGym: A unified platform for psycholinguistic assessment of neural language models
Ethan Wilcox,
Jennifer Hu,
Jon Gauthier,
and Roger Levy
MIT-IBM Watson AI Lab Research Week Colloquium
2017
Comparing patterns of pragmatic reasoning in Chinese and English color reference games
Jennifer Hu,
Andrew Jong,
Will Monroe,
and Christopher Potts
Council on Undergraduate Research 2017 Research Experiences for Undergraduates Symposium