Computational Social Science Literature Daily Reading Project

A daily practice of reading and annotating scholarly literature, inspired by Hongtao Huang's "A Paper A Day". Latest update: 2026-03-09.

Mar 9, 2026
Paper Full paper

The role of style in reader-involvement: Deictic shifting in contemporary poems

Lesley Jeffries

Computational Methods · Narrative Analysis

Citation: Jeffries, L. (2008). Journal of Literary Semantics (JLSE).

  1. TL;DR: This article examines whether reader-involvement in a poem can be at least partly explained by stylistic features. By comparing Peter Sansom’s “Mittens” and Medbh McGuckian’s “Pain tells you what to wear,” the study uses deictic shift theory and blending theory to explain how different linguistic choices draw readers into the story world or alienate them.
  2. Theory & Concepts:
    • Deictic Positioning: “Thus, it is thought that the way that deictic expressions work in narrative is that they cause the reader to take up the deictic positioning of the character, place and time which are indicated by the textual triggers. Unlike everyday interaction, then, where the first person pronoun I typically references the person speaking, the use of first person in a narrative both identifies the narrator and also provides a perspective for the reader to enter the text world.” (Jeffries, 2008, p. 71).
    • Time References: “These alternating time references may cause the readers of this poem, then, to psychologically place themselves repeatedly in the present and then the past of the narrator’s deictic field.” (Jeffries, 2008, p. 74).
    • Merging of Narrative Perspective: “The incongruities of the text, whereby the narrator appears to be at once a small boy and an adult man, emerge more strongly in the second – and final – stanza and create less a sense of deictic shifting between two different ages of the first person narrator than a blending of the two into a new, rather unusual, character who is an adult but is being treated like a child.” (Jeffries, 2008, p. 74). This creates an effect where both past and present identities are simultaneously primed, challenging normal decompression.
    • Merging of Historic Context and Present: “In purely textual terms, then, there is no way of knowing in these first five lines, though the first stanza context may incline us towards the uncomfortable reading that this is genuinely happening in the present to the adult narrator, whilst the disjunction with what we know about the world may incline us to revise the deictic field to one in the past (i.e. a flashback), but described in the historic present tense.” (Jeffries, 2008, p. 75).
    • Voiding the Deictic Field & Generalized Narration: Some poetic openings stop chronological progression by failing to instantiate a specific WHO, WHERE, or WHEN, effectively voiding the deictic field. McGuckian’s poem uses a generalized second-person narrative: “The use of generalized second person pronouns, superficially hints at an addressee, whose referent could be the reader if this were so, but is more likely to refer to some generic third person.” This keeps the reader estranged and at arm’s length.
    • Priming and Prominence: The psychological prominence of textual triggers dictates the depth of reader involvement. “These two poems support such a notion since the deictic fields of ‘Mittens’, though unusual and at times apparently contradictory, are nevertheless strongly primed, whereas those of ‘Pain tells you what to wear’ are less clear and thus less fully primed.”
  3. The Role of Pronouns:
    • First-person “I” (Mittens): Provides an entry perspective for the reader to enter the text world; facilitates identification with the narrator and allows readers to interpret the poem in personally relevant ways; juxtaposing two “ages” of the narrator demonstrates that identity is not singular across a lifetime.
    • Second-person “you” (Pain tells you what to wear): Iterative descriptions induce a generic third-person reading rather than true address; with no overt “I” to identify with, conceptual blending between reader and narrator remains incomplete, creating alienation; the narrator effectively hides behind a generalized pronoun that fails to establish a clear deictic centre.
Mar 7, 2026
Paper Full paper

An Annotated Dataset of Coreference in English Literature

David Bamman, Olivia Lewke & Anya Mansoor

Computational Methods · Narrative Analysis

Citation: Bamman, D., Lewke, O. & Mansoor, A. (2020). Proceedings of the Twelfth Language Resources and Evaluation Conference. ACL Anthology

  1. TL;DR: The paper introduces a literary coreference dataset and shows that English fiction contains distinctive long-distance, bursty entity patterns; models trained on in-domain literary data perform substantially better than those trained on standard news benchmarks.
  2. Theoretical Frame: Computational Literary Studies / coreference resolution domain adaptation.
  3. Key Concepts:
    • Literary coreference: Differs from standard coreference because literary narratives track entities across long spans, shifting narrative perspectives, and character development over the course of a novel.
    • Near-identity: A relation shaped by neutralization/compression and refocusing/decompression, which respectively minimize or maximize perceived differences between discourse entities.
  4. Data & Method: 210,532 tokens from 100 English-language fiction works in LitBank (1719-1922), annotated for ACE-style entity categories and used to evaluate a BERT-based neural coreference system across training domains.
  5. Key Findings:
    1. Most mentions in literary texts are characters/people (83.1%), and pronouns make up 54.3% of mentions.
    2. Long-range entities display bursty narrative behavior, while pronouns have a median distance of only two entities to their nearest antecedent.
    3. A model trained on literary in-domain data achieves an F-score six points higher than a model trained on OntoNotes.
  6. Takeaways:
    • Reusability: The annotation distinctions around generic vs. specific mentions and identity revelation are directly useful for narrative text analysis, especially mystery or character-centered texts.
    • Limitation: The dataset only annotates six ACE entity categories rather than unrestricted coreference.
    • Direct application: Essential for arguing that domain-specific data matters in NLP, especially when adapting entity tracking to literature, culture, or communication research.
  7. Key Quote: “While a model trained on in-domain literary data achieves an F-score six points higher than one trained on OntoNotes, it performs equivalently to a model trained on PreCo”.
Mar 5, 2026
Paper Full paper

Adapting NLP and Corpus Analysis Techniques to Structured Imagery Analysis in Classical Chinese Poetry

Alex Chengyu Fang, Fengju Lo, Cheuk Kit Chinn, Núria Bel, Erhard Hinrichs, Petya Osenova & Kiril Simov

Computational Methods · Cultural Studies

Citation: Fang, A. C., Lo, F., Chinn, C. K., Bel, N., Hinrichs, E., Osenova, P. & Simov, K. (2009). Proceedings of the Workshop on Adaptation of Language Resources and Technology to New Domains.

  1. Research Objective: Establish a computational framework for analyzing creative language in classical Chinese poetry using NLP and corpus analysis, focusing on extraction, classification, and structural analysis of poetic “imageries” to investigate inter-poet and intra-poet stylistic differences.
  2. Ontology of Imageries: Built on a complete collection of Tang dynasty poems; words segmented and indexed into semantic classes (synsets) across six main categories: human, affair, time, space, object, and miscellany.
  3. Four Structural Levels of Imagery:
    • Primary imageries: Head nouns with imagery potential (e.g., “winter”)
    • Complex imageries: Primary imageries modified by a premodifier or determiner (e.g., “harsh winter”)
    • Extended imageries: Complex imageries serving clausal functions or acting as predicates
    • Textual imageries: A system of extended imageries designed to articulate artistic conception
  4. Syntactic Parsing: Java-based phrase structure grammar (PSG) parser generates syntactic trees for POS-tagged poetic lines, enabling automatic identification of imagery units.
  5. Case Study — Su Shi vs. Liu Yong:
    • Grammatical: Liu Yong: higher verb proportion (16.2%), fewer adjectives (11.9%) → colloquial/vocal style. Su Shi: fewer verbs (10.2%), more adjectives (16.5%) → dense, formal, scholarly style.
    • Syntactic: Su Shi favored SVO constructions (more subjects, objects, adverbials, complements) typical of formal prose. Liu Yong showed higher predicate and premodifier proportions, consistent with casual style.
  6. Conclusion: Structured sub-categorization of imageries successfully links linguistic analysis (lexical, grammatical, syntactic) to literary evaluations of style; the framework proves effective for automatic analysis of classical Chinese poetry.
Paper Full paper

A Computational Approach to Style in American Poetry

David M. Kaplan & David M. Blei

Computational Methods · Cultural Studies

Citation: Kaplan, D. M. & Blei, D. M. (2007). Seventh IEEE International Conference on Data Mining (ICDM 2007).

  1. Research Question: Computationally capture a comprehensive scope of poetic style; poetry as a domain has “gone largely unexplored” compared to prose.
  2. Place in the Literature: Moves beyond predominant bag-of-words and diction-based approaches to text analysis.
  3. Operationalization: Maps poem text to a high-dimensional vector via multi-layered metrics:
    • Orthographic: Word count, line count, stanza count, average line/word length, lines per stanza; frequencies of the most frequent noun, adjective, and verb as proxies for repetition
    • Syntactic: Frequencies of contractions and parts of speech aggregated at different levels of specificity
    • Phonemic: Alliteration and formal rhyme scheme
  4. Visualization: PCA projection of high-dimensional poem vectors onto two dimensions, preserving greatest variance to depict relative poem similarity.
  5. Validity Risks:
    • POS tagger trained on Wall Street Journal corpus — domain mismatch with poetic language
    • PCA dimensionality reduction may obscure non-linear stylistic patterns
  6. Comments: Pioneering early work on computational poetic style analysis. However, the feature set remains surface-level and does not fully capture the concise construction of narratives.
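The orthographic metrics listed under Operationalization can be sketched in a few lines of Python. This is only an illustration, not Kaplan and Blei's implementation: the function name `orthographic_features`, the blank-line stanza convention, and the whitespace tokenization are all assumptions made for the example.

```python
def orthographic_features(poem: str) -> dict:
    """Compute simple orthographic features for one poem (illustrative sketch)."""
    # Assumption: stanzas are separated by one or more blank lines.
    stanzas = []
    current = []
    for line in poem.splitlines():
        if line.strip():
            current.append(line.strip())
        elif current:
            stanzas.append(current)
            current = []
    if current:
        stanzas.append(current)

    lines = [line for stanza in stanzas for line in stanza]
    # Assumption: words are whitespace-separated tokens with edge punctuation stripped.
    words = [w.strip(".,;:!?\"'()") for line in lines for w in line.split()]
    words = [w for w in words if w]

    return {
        "word_count": len(words),
        "line_count": len(lines),
        "stanza_count": len(stanzas),
        "avg_words_per_line": len(words) / len(lines) if lines else 0.0,
        "avg_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
        "lines_per_stanza": len(lines) / len(stanzas) if stanzas else 0.0,
    }
```

Each poem then becomes one row of numeric features; stacking those rows gives the kind of high-dimensional vector that a projection such as PCA would take as input.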
Paper Full paper

A Computational Analysis of Poetic Style: Imagism and Its Influence on Modern Professional and Amateur Poetry

Justine T. Kao & Dan Jurafsky

Computational Methods · Cultural Studies

Citation: Kao, J. T. & Jurafsky, D. (2015). Linguistic Issues in Language Technology.

  1. Research Questions:
    • How do literary movements transform the ideals of poetic beauty?
    • How do changes in aesthetic standards impact different levels of expertise (elite vs. amateur)?
    • Can we measure these shifts using computational methods?
  2. Place in the Literature: Builds on Martindale’s (1990) work on elite corpus measurement; extends to analyzing changes surrounding Imagism and measuring the trickle-down effect from elite to mass aesthetics.
  3. Key Concepts & Operationalization:
    • Concrete imagery: Average concreteness rating of all words → Concreteness score
    • Emotional language: Average valence and arousal ratings from affective norms database → Valence & Arousal scores
    • Sound devices: Identity rhyme, perfect rhyme, slant rhyme, alliteration, consonance, and assonance
    • Diction: Summed word frequencies divided by poem length → WordFreq score
    • Exactness: Unique word types / total word instances → Type-Token Ratio
  4. Research Design: Three comparative analyses — Imagists vs. 19th-century professionals; contemporary vs. 19th-century professionals; contemporary professionals vs. amateurs.
  5. Comments: Effective framework for measuring stylistic features across expertise levels and historical periods. The elite-to-masses comparison is compelling. However, most measurements remain limited to word-level statistics.
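Two of the word-level measures above, the Type-Token Ratio and the WordFreq score, are simple enough to sketch directly. The version below follows the definitions as summarized in these notes; the tokenizer and the toy frequency table are assumptions for illustration, and the paper's actual lexical resources are not reproduced here.

```python
def tokenize(text: str) -> list:
    """Assumption: lowercase whitespace tokens with edge punctuation stripped."""
    tokens = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    return [t for t in tokens if t]


def type_token_ratio(tokens: list) -> float:
    """Unique word types divided by total word instances."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def word_freq_score(tokens: list, freq_table: dict, default: float = 0.0) -> float:
    """Summed word frequencies divided by poem length (in tokens).

    freq_table is a hypothetical word -> frequency mapping; words missing
    from it contribute `default`.
    """
    if not tokens:
        return 0.0
    return sum(freq_table.get(t, default) for t in tokens) / len(tokens)


# Toy example with made-up frequencies.
tokens = tokenize("The rose, the red rose.")
toy_freqs = {"the": 10.0, "rose": 2.0, "red": 4.0}
ttr = type_token_ratio(tokens)
freq = word_freq_score(tokens, toy_freqs)
```

Both scores depend only on individual word counts, which is the sense in which the measurements stay at the level of word statistics.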
Aug 14, 2025
Book Chapter 96–110

The Political Economy of Communication

Vincent Mosco

Political Economy · Communication
  1. Social and intellectual factors shaping the political economy of communication:
    • Social driving forces:
    1. Media have been organized along industrial lines, with most media labor working for wages; yet media conglomerates are now so powerful that they can control the circuit of accumulation without retaining the risks of full ownership—flexible accumulation has consolidated global media power
    2. However, political economy is ill-equipped to examine behaviors organized around the household; likewise, a new vocabulary used to discuss audiences challenges political economy—audiences participate in the accumulation process through listening, reading, watching, and purchasing media content
    3. The process of transnationalization, which also gave rise to the demand for a New World Information and Communication Order (NWICO); “globalization” is frequently used as a vague synonym for transnationalization
      • E.g., the Non-Aligned Movement — national self-determination — universal access to communication media, control over production and distribution, and communication as a fundamental human right — examining the role of communication in the U.S.-led postwar restructuring of global capitalism
      • New technologies expand audience control and undermine national sovereignty — embedded imagined American-style imagery — charting the growth and power of transnational communication corporations
      • Information and communication technologies possess the power to restructure the global division of labor
    4. The emergence of the information society:
      • A radical rupture of time, space, and social relations
      • Dan Schiller, Digital Capitalism
      • Open source and hacker networks challenge ownership: a contradiction between the desire for free access to information and capitalism’s drive to harness information solely for generating surplus value
  2. The communication industry is fundamentally incompatible with the traditional classical economic model
Book P225–235, Chapter Four: The Spectacle of the Other

Representation: Cultural Representations and Signifying Practices

Stuart Hall, Ed.

Cultural Studies · Communication
  1. Two Guiding Questions:
    • Have the repertoires of representation around “difference” and “otherness” changed, or do earlier traces from colonial and racialized narratives remain embedded in contemporary culture?
    • Is there a possibility for an effective politics of representation that can challenge and transform dominant stereotypical portrayals?
  2. Persistence of Difference and Power:
    • Even with new modes of visual and cultural production, many symbolic codes, visual motifs, and narrative frames of the “racialized Other” still echo colonial-era patterns
    • Old and new forms often overlap: modern advertising, film, and journalism may adopt new aesthetics, yet still rely on deep-seated binaries (civilized/primitive, modern/backward, white/non-white)
  3. Representation as a Site of Power Struggle:
    • The “spectacle of the Other” emphasizes that seeing and being seen are part of a broader power relation
    • Mainstream media and cultural industries do not merely “depict” the Other; they define the Other—this power to represent is central to cultural hegemony
    • Representation is not a one-way process; it is constantly negotiated among producers, texts, and audiences, though dominant discourses generally prevail
  4. Counter-Representation Strategies:
    • Reversing the stereotype: Flipping a negative stereotype into its opposite, re-signifying it in a positive or empowering way
    • Positive imagery: Presenting affirmative, authentic, and diverse representations to offset prevailing negative portrayals
    • Contesting from within: Working within mainstream forms and platforms to insert new narratives that disrupt established codes
    • Hall warns that each approach has limits—reversal can still lock thinking into binary oppositions, while positive imagery may gloss over structural inequalities
Aug 11, 2025
Book Chapter 86–95

The Political Economy of Communication

Vincent Mosco

Political Economy · Communication
  1. How to define communication:
    • The rhetoric of conversation should provide the standards of science just as the logic of inquiry does; understanding occurs when two or more persons exchange observations and ideas and express them in a language that does not merely reveal reality but contributes to constructing it.
    • Two definitional approaches: information transmission versus the construction of meaning
    • Mathematically grounded communication science: Shannon and Weaver — the transmission of information from a communicator and encoder to a decoder
    • Sociological perspective: The process of constructing meaning
    • Market specialists: An interactive process between two or more parties, in which meaning is exchanged through the intentional use of symbols
    • Implications for political economy: Communication studies should view communication systems as integral to the fundamental economic, political, and cultural processes of society: the framework of capitalism’s essential elements — capital accumulation and labor wages
Book P213, Chapter Nine

Rise of the Red Engineers: The Cultural Revolution and the Origins of China's New Class

Joel Andreas

Chinese Politics · Cultural Studies
  1. Worker-Peasant-Soldier Students’ Attitude: They regarded the loss of collective consciousness among new college students as a sign of reduced political rights and participation in the New Era. This illustrates a wide generation gap: the traditional habitus no longer existed, and ideology had given way to individual self-consciousness.
  2. Middle School Changes: Keypoint classes were revived, showing that a brand-new educational hierarchy had emerged, one that produced an educational elite.
  3. Party Membership Transformation: Shorn of its ideological meaning, party membership retained its instrumental value as a political credential and networking tool, attracting ambitious university students who aspired to public service and leadership positions.
  4. Loss of Inspirational Core: The loss of the inspirational core of communist party membership reduced it to a political credential and networking tool.
  5. Core of New Tsinghua University: The transfer of class power, the flow and change of the knowledge hierarchy.
Paper Continued reading

Quantifying Narrative Similarity Across Languages

Waight, Messing, Shirikov, Roberts, Nagel, Greenfield, Brown, Aslett & Tucker

Computational Methods · Narrative Analysis

Language Standardization Process:

  1. Standardized Prompt: “Please summarize this news article in 7–10 English sentences. Article: [insert article text]”
  2. Claims Extraction: Prompt GPT-4o to extract the “descriptive, normative, causal, and classificatory claims” and “people, places, things, and events” included in each summary
  3. Bias Mitigation: How to deal with the bias of the LLMs
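The standardized prompt quoted in step 1 can be wrapped in a small helper so that every article receives identical wording. This is a hedged sketch: `build_summary_prompt` and `standardize` are hypothetical names, and `llm` stands for any callable that takes a prompt string and returns text; no particular vendor API is assumed.

```python
# Template text taken from the prompt quoted in the notes above.
SUMMARY_PROMPT = (
    "Please summarize this news article in 7–10 English sentences. "
    "Article: {article}"
)


def build_summary_prompt(article_text: str) -> str:
    """Fill the standardized summary prompt with one article."""
    return SUMMARY_PROMPT.format(article=article_text.strip())


def standardize(articles: list, llm) -> list:
    """Summarize every article with the same prompt.

    `llm` is a hypothetical callable: prompt string in, summary string out.
    """
    return [llm(build_summary_prompt(a)) for a in articles]
```

Passing the model in as a plain callable keeps the prompt logic testable without network access, for example `standardize(["Some article text"], llm=lambda prompt: prompt.upper())`.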

Two-Step Candidate Generation Process:

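  1. Bi-encoder to Cross-encoder: A bi-encoder scores all pairs cheaply, and only pairs above a similarity cutoff are passed on to the more expensive cross-encoder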
  2. Efficient Pairwise Comparison: Reducing computational complexity through hierarchical filtering
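The two-step idea can be illustrated with a cheap score that screens every pair and a costlier score that runs only on the survivors. The sketch below is not the authors' pipeline: `jaccard` and `char_similarity` are toy stand-ins for a bi-encoder and a cross-encoder, and the cutoff value is arbitrary.

```python
from difflib import SequenceMatcher
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Toy stand-in for a cheap bi-encoder-style similarity: word overlap."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def char_similarity(a: str, b: str) -> float:
    """Toy stand-in for a costly cross-encoder-style score."""
    return SequenceMatcher(None, a, b).ratio()


def two_stage_candidates(texts, cheap_score, expensive_score, cutoff):
    """Return (i, j, score) for the pairs that survive the cheap filter."""
    survivors = []
    for i, j in combinations(range(len(texts)), 2):
        # Stage 1: the cheap score is computed for every pair.
        if cheap_score(texts[i], texts[j]) >= cutoff:
            # Stage 2: the expensive score runs only on the survivors.
            survivors.append((i, j, expensive_score(texts[i], texts[j])))
    return survivors
```

With a reasonably strict cutoff, the expensive scorer sees only a small share of all pairs, which is where the computational savings come from.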
Aug 10, 2025
Book Chapter 57–85

The Political Economy of Communication

Vincent Mosco

Political Economy · Communication
  1. Marx’s conception of the commodity: all commodities are nothing but a definite quantity of congealed labour-time
  2. A critique within communication studies: orthodox Marxism overemphasizes the instrumentality and productivity of labor, reducing human beings to objects of production or mere factors of production — thus minimizing the other material practices of labor
  3. The political economy of communication proposes to demonstrate how communication and culture constitute material practices, how labor and language are mutually constitutive, and how communication and information form part of the same social activity — the dialectical relationship of the social construction of meaning
  4. The conservative critique of political economy:
    1. Conservatives hold that the growth of wealth originates from an organic order that leads people to respect tradition, which in turn clarifies social roles and the moral foundations compelling their fulfillment
    2. At its core: opposition to empowering the masses; the belief that the essence of economics is the maximization of pleasure — minimizing the sacrifice of what is unwanted while obtaining what is most desired (Jevons, 1965)
    3. Classical economics, under its preoccupation with equilibrium states, focuses on consensual change while neglecting the traditional interest in history — rendering it unable to analyze periods of transformation
    4. The market provides classical and neoclassical economics with their moral justification
  5. Several variants of political economy:
    1. Neoconservatism: Government as the primary beneficiary of regulation — hence the push for deregulation; essentially releasing politics to preserve scientific rigor
    2. Institutional economics
    3. Marxist political economy: Critique of corporate concentration, exploration of state intervention, and response to transformations in production technology, industrial organization, and world markets
    4. Feminist political economy: Incorporating domestic labor into the system and analysis of market exchange; lacking a concrete calculative framework
    5. Environmental political economy: The transformation of the moral horizon
  6. Why political economy is needed:
    1. Economics favors describing statics — everything is resolved in equilibrium, with interest limited to describing incremental change within a given set of institutions
    2. It is unable to incorporate major socioeconomic determinants into its analysis
    3. It tends to treat the market as a natural product of individual interaction, thereby exacerbating social divisions of class, race, and gender
    4. There is an urgent need to disentangle economic orthodoxy from its rhetorical system and to study it as part of a system of power
Paper Initial reading

Quantifying Narrative Similarity Across Languages

Waight, Messing, Shirikov, Roberts, Nagel, Greenfield, Brown, Aslett & Tucker

Computational Methods · Narrative Analysis

Research Purpose: To study how ideas and narratives diffuse across text, we must first develop a method to identify whether texts share the same information and narratives, rather than the same broad themes or exact features.

Methods: Use large language models to distill texts to their core ideas, then compare the similarity of claims rather than of words, phrases, or sentences. Claims are extracted with GPT-4o and compared with SBERT.

Research Questions: The spread of Russian claims about the development of a Ukrainian bioweapons program in U.S. mainstream and fringe news websites.

  1. Narrative diffusion: A process that creates narrative similarity as an empirical relic
  2. When introducing or relying on a key term, review the literature on that specific term
  3. Gold standard approach: The procedure one would use to identify narrative similarity if resources were unlimited
  4. LLM-SBERT: Using the key claims and subjects to measure the similarity
  5. How to minimize the scale of large pairwise comparisons:
    1. Use LLMs to distill the core claims and subjects
    2. Use a pretrained BERT-based model to compute semantic similarity
    3. Use human annotators to label the pairs identified as positive cases
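The scale problem behind this list is easy to see with a little arithmetic: n documents produce n(n-1)/2 unordered pairs. The sketch below computes that count and the resulting annotation load; `positive_rate` is a hypothetical share of pairs flagged by the automated steps, not a figure from the paper.

```python
from math import comb


def pair_count(n_docs: int) -> int:
    """Number of unordered document pairs: n * (n - 1) / 2."""
    return comb(n_docs, 2)


def annotation_load(n_docs: int, positive_rate: float) -> int:
    """Pairs left for human annotators if only `positive_rate` of all
    pairs are flagged as positive by the automated steps (hypothetical)."""
    return round(pair_count(n_docs) * positive_rate)


print(pair_count(10_000))  # 49995000
```

Because the pair count grows quadratically, even a modest corpus is out of reach for exhaustive human comparison, which motivates filtering before annotation.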

Questions: How should LLM bias be handled?

Aug 2, 2025
Book Chapter 48–56

The Political Economy of Communication

Vincent Mosco

Political Economy · Communication
  1. Social relations, particularly power relations, interactively constitute the production, distribution, and consumption of resources; this requires the political economy of communication to examine the constantly shifting modes of control along the lines of production, distribution, and consumption — or how activists use new media to resist the concentration of power. However, within the communication industry, producers, distributors, and consumers cannot be precisely defined: audiences cannot be straightforwardly identified as consumers, for in consuming media products they also produce symbolic value.
  2. An alternative definition of political economy: the study of control and survival in social life.
Jul 28, 2025
Book Chapter 30–47

The Political Economy of Communication

Vincent Mosco

Political Economy · Communication
  1. The fundamental analytical framework of the political economy of communication: commodification, spatialization, and structuration
  2. The audience commodity thesis: Workers sell their labor power (attention) in return for the media content they receive; this extends the scope of labor, expands the traditional logic of the commodity, and may serve as an account of the labor transformation process under capitalism
  3. Labor (Braverman, 1974): A conceptual unity of imagination and creative capacity, design and execution; in the process of commodification, capital functions to separate conception from execution, to separate skill from the unskilled performance of tasks, thereby forming a managerial class, while new configurations of skill and power in production are combined to restructure the labor process
  4. From commodification to spatialization: The emergence of information systems enables information to transform space — redistributing industrial resources
  5. The tradition of the political economy of communication treats spatialization as the institutional extension of corporate power within the communication industry
Jul 27, 2025
Book P1–P8

Communist Neo-Traditionalism: Work and Authority in Chinese Industry

Andrew G. Walder

Chinese Politics · Political Economy
  1. A totalitarian society has two distinguishing characteristics:
    1. The nature of the tie between the totalitarian party and its active adherents
    2. Social atomization: “the obliteration of social ties that are not directly harnessed to the party’s aims. Totalitarian societies recognize no legitimate distinction between private and public spheres.”

Questions: Does social atomization exist only in a totalitarian society? What about the rise of media and internet platforms?

  1. Walder argues that modern socialist society came to rely less on coercion. Where the totalitarian image places its emphasis on the disincentives and psychological states created by fear and inbred caution, the neo-traditional image emphasizes the meshing of economic and political power on the structured incentives offered by the party.
  2. How the social structure works: The result is a highly institutionalized network of patron-client relations that is maintained by the party and is integral to its rule: a clientelist system in which public loyalty to the party and its ideology is mingled with personal loyalties between party branch officials and their clients.
  3. How it’s maintained: The neo-traditional image stresses the social network, not the group, as its main structural concept.
Paper Full paper

Written for Lawyers or Users? Mapping the Complexity of Community Guidelines

Nahrgang, Weidmann, Quint, Nagel, Theocharis & Roberts

Platform Governance · Computational Methods

Previous Problems: Harmful content has pushed social platforms to moderate content, yet many users have low trust in the resulting restrictions. Platforms rely on community guidelines to set rules for users, but these guidelines differ widely from one platform to another, and prior research has paid little attention to the users’ perspective.

Research Questions: How long, how readable, and how semantically complex are community guidelines?

  1. Platform Typology: Chat platforms, creator platforms, forum platforms, social network platforms
  2. Alt Tech Platforms: Platforms such as Gab that position themselves as alternatives to Silicon Valley-controlled platforms
  3. Platform Governance Archive (PGA): GitHub repository makes historical versions available
  4. Readability Measurement: The Flesch-Kincaid Grade Level relies on average sentence length (words per sentence) and average word length (syllables per word) in a given text
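The Flesch-Kincaid Grade Level is computed as 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The sketch below applies that formula with a rough syllable heuristic; the syllable counter and the sentence splitter are simplifying assumptions, so scores will differ somewhat from those of dedicated readability tools and from whatever tooling the paper used.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups, dropping a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level for an English text."""
    # Assumption: sentences end at '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )
```

Longer sentences and more polysyllabic words both raise the grade, so legalistic guideline text tends to score higher than plain-language text.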