A Framework for Leveraging the Interpretable Properties of Word Mover’s Distance in Sociocultural Analysis - Incite at Columbia University
- Published November 11, 2021
- Authors Mikael Brunila, Jack LaViolette
- Category Paper
- Forum ACL Anthology
- Link doi.org
Despite the increasing popularity of NLP in the humanities and social sciences, advances in model performance and complexity have been accompanied by concerns about interpretability and explanatory power for sociocultural analysis.
One popular model that takes a middle road is Word Mover’s Distance (WMD). Ostensibly adapted for its interpretability, WMD has nonetheless been used and further developed in ways which frequently discard its most interpretable aspect: namely, the word-level distances required for translating a set of words into another set of words. To address this apparent gap, we introduce WMDecompose: a model and Python library that 1) decomposes document-level distances into their constituent word-level distances, and 2) subsequently clusters words to induce thematic elements, such that useful lexical information is retained and summarized for analysis. To illustrate its potential in a social scientific context, we apply it to a longitudinal social media corpus to explore the interrelationship between conspiracy theories and conservative American discourses.
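As a rough illustration of the decomposition idea, the sketch below solves the WMD transport problem for two tiny documents and then splits the document-level distance into per-word contributions. It is not the WMDecompose API: the function names `wmd_with_flow` and `decompose_by_word`, the toy two-dimensional vectors, and the uniform word weights are all assumptions made for this example, and the clustering step is left out.

```python
import numpy as np
from scipy.optimize import linprog


def wmd_with_flow(source_weights, target_weights, cost):
    """Solve the WMD transport problem; return (distance, flow matrix)."""
    n, m = cost.shape
    # Row sums: each source word ships out exactly its weight.
    row_constraints = np.kron(np.eye(n), np.ones((1, m)))
    # Column sums: each target word receives exactly its weight.
    col_constraints = np.kron(np.ones((1, n)), np.eye(m))
    result = linprog(
        cost.ravel(),
        A_eq=np.vstack([row_constraints, col_constraints]),
        b_eq=np.concatenate([source_weights, target_weights]),
        bounds=(0, None),
    )
    flow = result.x.reshape(n, m)
    return float((flow * cost).sum()), flow


def decompose_by_word(flow, cost):
    """Split the document-level distance into per-word contributions."""
    contributions = flow * cost
    # Axis 1 sums over target words (cost per source word), axis 0 the reverse.
    return contributions.sum(axis=1), contributions.sum(axis=0)


# Toy 2-D "embeddings" (hypothetical values, not from a trained model).
source_vectors = np.array([[0, 0], [1, 0]])
target_vectors = np.array([[0, 1], [3, 0]])
source_weights = np.array([0.5, 0.5])
target_weights = np.array([0.5, 0.5])

# Pairwise Euclidean distances between source and target word vectors.
cost = np.linalg.norm(
    source_vectors[:, None, :] - target_vectors[None, :, :], axis=2
)

distance, flow = wmd_with_flow(source_weights, target_weights, cost)
per_source, per_target = decompose_by_word(flow, cost)
print(distance, per_source, per_target)
```

The per-word contributions sum to the document-level distance, so ranking them shows which words account for most of the cost of moving one document onto the other.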
Finally, because of the full WMD model’s high time complexity, we additionally suggest a method of sampling document pairs from large datasets in a reproducible way, with tight bounds that prevent extrapolation of unreliable results due to poor sampling practices.
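The reproducibility half of that idea can be sketched with a seeded sampler over document pairs. This is only a minimal illustration under stated assumptions: the function name `sample_document_pairs` and its parameters are hypothetical, and the sketch does not implement the bounds described in the paper.

```python
import random


def sample_document_pairs(n_source, n_target, n_pairs, seed):
    """Draw n_pairs distinct (source, target) index pairs, reproducibly."""
    total = n_source * n_target
    if n_pairs > total:
        raise ValueError("requested more pairs than exist")
    # A dedicated, seeded generator keeps the draw independent of global state.
    rng = random.Random(seed)
    # Sample flat indices without replacement, then decode each into a pair.
    flat_indices = rng.sample(range(total), n_pairs)
    return [divmod(k, n_target) for k in flat_indices]


pairs = sample_document_pairs(n_source=1000, n_target=2000, n_pairs=5, seed=42)
print(pairs)
```

Recording the seed alongside the results lets the same pairs be redrawn later, so the expensive pairwise distances only need to be computed for a fixed, documented subset.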