CSS Literature Reading

Reading notes on literature for the Computational Social Science (CSS) course.

2026-05

Rethinking Social Media Strategy: Crafting Digital Sensory Appeals to Maximize Customer Engagement

Lee, N. Y., Edelblum, A., Park, K., & Zablah, A. R. (2026). Rethinking social media strategy: Crafting digital sensory appeals to maximize customer engagement. Journal of the Academy of Marketing Science.
  • Research Question: This study focuses on how images and text work together on social media. It starts from a simple effect-based question: can a standalone image generate higher engagement than text paired with an image? More specifically, the paper asks whether digital sensory appeals work better when they are presented only through images rather than through multimodal image-text posts.
  • Literature Logic: The paper moves from sensory appeals in offline marketing environments to digital sensory appeals in social media content design. It then uses transportation theory and cognitive load theory to question the common assumption that multimodal content is always better for social media marketing.
  • Theoretical Framework: The study is based on transportation theory. The key idea is that sensory appeals do not work by providing more information, but by helping users mentally enter the consumption scene shown in the image. A single visual modality can reduce the cognitive load caused by integrating multiple modalities. As a result, users can more easily imagine the sensory experience suggested by the post.
  • Method: The paper uses both secondary data and experiments. In the secondary-data study, the authors analyze 1,041 Instagram posts from a coffee shop in the Midwestern United States between 2011 and 2024. They use regression models to test how post modality and sensory appeals affect engagement. In the experimental studies, they manipulate content modality and appeal type to compare user engagement with sensory and non-sensory content under different presentation formats.
  • Key Findings: For digital sensory appeals, standalone images increase user engagement more than multimodal image-text posts. In the Instagram data, standalone sensory image posts can generate up to 124% more engagement than image-text posts. This effect works mainly through transportation: standalone images make it easier for users to become immersed in the sensory scene, which increases their willingness to engage. For non-sensory appeals, standalone images do not have the same advantage and may even perform worse than image-text content. The paper argues that social media content is not always better when it contains more modalities. When content depends on users’ sensory imagination, less can be more. This is especially relevant for food, drinks, perfume, fashion, and other products where consumers already have related sensory experience or brand loyalty.

Seeing the Surreal: Mapping Surrealism in Photorealistic AI-Generated Images Using Large Language Models

Liu, X., Lu, Y., Peng, Q., Qian, S., Peng, Y., & Shen, C. (2026). Seeing the Surreal: Mapping Surrealism in Photorealistic AI-Generated Images Using Large Language Models. Computational Communication Research, 8(2), 1.
  • Research Question: Instead of focusing only on AI’s ability to generate images or users’ ability to detect AI-generated images, this study asks how surrealism appears in photorealistic AI-generated images. It examines what types of surrealism exist, what visual elements express them, and how they reflect the visual logic of the generative AI era.
  • Literature Logic: The paper begins with a gap in current research on AI-generated images. It then introduces surrealism and algorithmic surrealism, treating surrealism as a meaningful content feature in algorithmically mediated visual communication. Based on this logic, the study asks how such content can be described and classified, and how traditional supervised or unsupervised methods can be combined with large language model-based image understanding.
  • Theoretical Framework: The study draws mainly on the artistic theory of surrealism, which emphasizes imagination, dreams, the unconscious, and the breaking of rational order. It may be better to say that the paper uses surrealism as a theoretical lens rather than a full explanatory framework. This lens helps the authors interpret AI-generated images as a new form of algorithmically mediated visual expression.
  • Method: The authors collected 28,290 images from 47 AI image creator accounts on Instagram. After manual cleaning, they retained 26,771 photorealistic AI-generated images. The study uses a large language model-assisted mixed-method framework. First, human annotation and qualitative analysis were used to build a codebook with three types of surrealism: physical surrealism, behavioral surrealism, and contextual surrealism. Then GPT-4o was used to classify the large-scale image sample. After that, GPT-4o generated textual summaries of the images. These summaries were analyzed with LDA topic modeling and topic network analysis to identify recurring visual elements and their co-occurrence patterns in surreal AI images.
  • Key Findings: The study finds that surrealism is a major feature of photorealistic AI-generated images. About 66.9% of the sample contains some form of surrealism, with physical surrealism being the most common type. Further analysis shows that many images present mixed forms of surrealism and repeatedly use certain visual elements. In the discussion, the authors argue that algorithmic surrealism expands visual imagination, but it may also lead to visual homogenization, the reproduction of stereotypes, the aestheticization of technical flaws, and political misinformation.
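
The topic network step in the Method note can be illustrated with a small sketch. It assumes each image has already been assigned a set of topic labels and simply counts how often two topics appear in the same image. The topic labels and the `cooccurrence_edges` helper are hypothetical and are not taken from the paper, whose topics come from LDA over GPT-4o summaries.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(image_topics):
    """Count how often each unordered pair of topics appears in the same image."""
    edges = Counter()
    for topics in image_topics:
        # Sort so that (a, b) and (b, a) map to the same edge key.
        for pair in combinations(sorted(set(topics)), 2):
            edges[pair] += 1
    return edges


# Hypothetical topic labels for four images (illustration only).
image_topics = [
    {"animals", "clouds", "architecture"},
    {"animals", "clouds"},
    {"architecture", "water"},
    {"animals", "clouds", "water"},
]

edges = cooccurrence_edges(image_topics)
strongest_pair, weight = edges.most_common(1)[0]
```

The resulting counts can serve as edge weights in a network whose nodes are topics, which is one plausible way to read "co-occurrence patterns" here.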

What Makes Politicians’ Instagram Posts Popular? Analyzing Social Media Strategies of Candidates and Office Holders with Computer Vision

Peng, Y. (2021). What Makes Politicians’ Instagram Posts Popular? Analyzing Social Media Strategies of Candidates and Office Holders with Computer Vision. The International Journal of Press/Politics, 26(1), 143–166.
  • Research Question: This study examines how visual features of politicians’ Instagram posts influence audience engagement, such as likes and comments. It focuses on how different visual communication strategies affect public responses on social media. From a methodological perspective, the study is also interested in how computer vision can be used to identify and classify visual themes in political communication.
  • Literature Logic: The paper begins with the personalization of politics and the increasing use of social media by politicians. It then discusses how social media engagement has political implications and asks how personalization is visually expressed online. Drawing on research on self-disclosure and parasocial interaction, the author develops hypotheses about the effects of visual communication strategies on user engagement.
  • Method: The dependent variables are the number of likes and comments received by each Instagram post. The author uses computer vision techniques to identify visual content. Image types are identified through transfer learning and K-means clustering, then manually refined into four categories. Face++ is used for face detection, facial size measurement, and emotion recognition. The models also control for image aesthetics, posting time, politician characteristics, and account characteristics. Multilevel regression models are employed to test the effects of image categories, the presence of the politician’s face, face size, and emotional expressions on audience engagement.
  • Theoretical Framework: The study is grounded in the theory of political personalization. Personalization is operationalized through visual strategies such as showing private life, displaying the politician’s face, and expressing emotions. The paper also draws on research on social media virality, arguing that emotional arousal, social presence, and perceived intimacy can increase audience engagement.
  • Key Findings: The study finds that most politicians’ Instagram content still reflects traditional “politics as usual,” including meetings, speeches, and government activities. However, posts showing private or non-political situations generally receive more engagement. Posts that include the politician’s face, display a larger facial area, or express emotions also tend to attract more likes and comments. The main implication is that effective political communication on visual social media depends not only on issues and policy positions, but also on how politicians use images to create intimacy, recognizability, and emotional connection with audiences.
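
To make the clustering step in the Method note more concrete, here is a minimal K-means sketch in plain Python. It assumes image features have already been extracted (in the paper, through transfer learning); the two-dimensional `features` list is made up for illustration, and the deterministic initialisation is a simplification rather than the author's procedure.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain K-means on a list of equal-length feature tuples.

    Centroids start at the first k points so the result is deterministic.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical 2-D image embeddings forming two loose groups of posts.
features = [(0.1, 0.2), (0.9, 0.8), (0.2, 0.1), (0.8, 0.9), (0.15, 0.15), (0.85, 0.85)]
labels, centroids = kmeans(features, k=2)
```

In practice the clusters would then be inspected and merged by hand, which corresponds to the manual refinement into four categories.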

Investigating the Effects of Clickbait on User Engagement in Health Communication: A Mixed-Method Study

Deng, Z., Tang, Y., Wu, M., & Zhang, X. (2025). Investigating the effects of clickbait on user engagement in health communication: A mixed-method study. Information & Management, 104231.
  • Research Question: This study asks whether the effect of clickbait may be overestimated. Previous studies mainly focused on the textual and syntactic features of clickbait and its direct effect on user engagement. Because health information is closely related to personal interests, this paper examines the psychological mechanisms through which clickbait titles influence users’ clicking and sharing behaviors.
  • Literature Logic: The paper first introduces the clickbait phenomenon on social media platforms. It then reviews how previous studies have examined the effects of clickbait on user behaviors, especially clicking and sharing. After that, it introduces the theoretical framework and develops a mixed-method research design. Compared with many communication studies, its literature review is relatively short and more directly connected to the empirical design.
  • Theoretical Framework: The study uses self-awareness theory and separates user responses into two paths: subjective self-awareness and objective self-awareness. Information gaps direct users’ attention to external information and stimulate curiosity and fear of missing out, which can increase clicking. Emotional intensity directs users’ attention back to the self, making them worry about how others may evaluate their sharing behavior. This can produce fear of negative evaluation and reduce sharing.
  • Method: The study has a complex mixed-method design with four studies: two secondary-data studies, one semi-structured interview study, and one online experiment. Study 1 collects 4,500 articles, uses machine learning to identify clickbait, and runs regression models. Study 2 uses interviews to identify two key features of clickbait: information gap and emotional intensity. Study 3 uses an online experiment to test the psychological mechanisms. Study 4 uses secondary data again to validate the direct effects of information gap and emotional intensity on clicking and sharing. The early machine-learning operationalization is relatively broad, defining clickbait as titles that are obviously exaggerated, suggestive, or non-objective. After the interview study, the concept becomes more detailed and theoretically grounded.
  • Key Findings: Clickbait in health communication has a clear double-edged effect. It can increase clicks but reduce sharing. More specifically, information gaps increase clicking by stimulating curiosity and fear of missing out. High emotional intensity reduces sharing by increasing fear of negative evaluation. The paper also finds that digital literacy weakens the effects of information gaps on curiosity and fear of missing out, while source credibility strengthens the positive effect of information gaps on clicking and reduces the negative effect of emotional intensity on sharing.

Words Meet Photos: When and Why Photos Increase Review Helpfulness

Ceylan, G., Diehl, K., & Proserpio, D. (2024). Words Meet Photos: When and Why Photos Increase Review Helpfulness. Journal of Marketing Research.
  • Research Question: This study asks whether reviews with photos are more helpful than reviews without photos. More importantly, it examines whether consumers find reviews more helpful when the information in photos and words is similar or different. The core question is how the relationship between textual and visual information affects review helpfulness through processing fluency.
  • Literature Logic: The paper first explains why review helpfulness matters, because helpful reviews can shape consumer attitudes and behavior. It then highlights the growing role of photos in online reviews. Based on theory, it argues that the coordination between images and words can influence how effectively a review is processed and evaluated.
  • Theoretical Framework: The study treats helpfulness as an indicator of review effectiveness. Because online reviews are often multimodal, the authors examine how text and photos work together. The key mechanism is processing fluency: when photos and words provide similar information, readers can process the review more easily. This fluency creates a more positive feeling, which then leads readers to evaluate the review as more helpful.
  • Method Design: The paper uses a multi-method design, combining large-scale secondary data, machine learning, human validation, and experiments. First, the authors analyze 7.4 million Yelp restaurant reviews and 3.5 million photos. They use Google Vision API to extract image labels, Doc2Vec to transform review text and image labels into vectors, and cosine similarity to measure image-text similarity. They also use human coders to validate whether the algorithmic measure matches human perception. Then, they conduct five experiments to test the causal effect of image-text similarity on review helpfulness, the mediating role of processing fluency, and the boundary conditions of text difficulty and photo quality.
  • Key Findings: Adding photos generally increases review helpfulness. More importantly, reviews are perceived as more helpful when photos and words convey similar information. The mechanism is that image-text similarity makes information easier to process, and easier processing leads to higher perceived helpfulness. The paper also finds that this positive effect becomes weaker when the review text is harder to read or when photo quality is lower. In other words, more visual and textual information is not always better. Effective multimodal communication requires clear, consistent, and easy-to-process combinations of images and words.
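
The image-text similarity measure can be sketched as follows. The paper embeds review text and image labels with Doc2Vec; this example swaps in simple bag-of-words counts so that it runs with the standard library only, and the review text and labels are invented for illustration.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two bags of words (0 = no shared words)."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical review text and image labels (illustration only).
review = "the pizza crust was crispy and the cheese was great".split()
matching_labels = ["pizza", "cheese", "crust", "food"]
unrelated_labels = ["building", "street", "sign"]

sim_match = cosine_similarity(review, matching_labels)
sim_unrelated = cosine_similarity(review, unrelated_labels)
```

A photo whose labels overlap with the review text scores higher than an unrelated photo, which is the property the similarity measure is meant to capture.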

The cost of banning TikTok

Donati, D., & Fong, H. (2025). The cost of banning TikTok: Implications for the digital advertising market. Proceedings of the National Academy of Sciences, 122(38), e2512043122.
  • Research Question: How would a TikTok ban affect the digital advertising market, and in particular, would advertisers shift their budgets to other familiar platforms?
  • Methodology: The two-week temporary suspension provided a natural experiment that is well suited to Difference-in-Differences (DID): the authors compare advertising activity in the United States with that in 32 unaffected countries. (This work is a good example for studying and reproducing DID.)
  • Theoretical Framework: The paper does not state an explicit theoretical framework. However, following the logic of platform competition and basic supply-and-demand reasoning, the authors test whether TikTok and Meta function as substitutable advertising channels, and whether this substitutability varies by advertiser size.
  • Core Findings: On the day of the TikTok outage, ad volume on Meta increased by 6.3% and ad spending increased by 22.4%, but ad impressions did not increase correspondingly. As a result, CPM ad prices rose by 12.1%. The substitution effect was stronger among large advertisers: their Meta ad spending increased by about 67%, compared with about 22% among smaller advertisers. This suggests that large advertisers were better able to shift their TikTok budgets to Meta. The authors therefore argue that a TikTok ban could further strengthen the market power of platforms such as Meta and impose higher switching costs on resource-constrained small businesses.
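
As a study aid for the DID logic, the sketch below shows the basic 2x2 estimator and a back-of-the-envelope check on the reported percentages. The group means passed to `did_estimate` are invented, and both function names are hypothetical. The impression calculation only uses the identity CPM = 1000 × spend / impressions; the paper's estimates may come from separate specifications, so the identity need not hold exactly.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """2x2 difference-in-differences: treated change minus control change."""
    return (treated_post - treated_pre) - (control_post - control_pre)


def implied_impression_change(spend_change, cpm_change):
    """Since CPM = 1000 * spend / impressions, impressions scale as spend / CPM."""
    return (1 + spend_change) / (1 + cpm_change) - 1


# Hypothetical mean log ad spend (illustrative numbers, not the paper's data).
effect = did_estimate(
    treated_pre=10.00, treated_post=10.25,  # United States
    control_pre=9.00, control_post=9.05,    # unaffected countries
)

# Rough check using the percentages quoted in these notes.
impressions = implied_impression_change(spend_change=0.224, cpm_change=0.121)
```

Under these assumptions the implied rise in impressions is roughly 9%, well below the 22.4% rise in spending, which fits the reported increase in CPM.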

Extra Cues, Extra Views: A Multimodal Detection of Arabic Clickbait Thumbnail Verbo-Visual Cues

Al-Ali, M. N., & Hamzeh, M. S. M. (2024). Extra cues extra views: A multimodal detection of Arabic clickbait thumbnail verbo-visual cues. Discourse & Communication, 18(1), 3–27.
  • Research Question: This study asks which Arabic YouTube thumbnails make users more likely to click and how these thumbnails create false attraction through verbal and visual cues. It focuses on how visual cues, linguistic cues, and image-text strategies work together in clickbait thumbnails.
  • Method: The authors selected 100 typical clickbait thumbnails from five Arabic YouTube channels. They compared these thumbnails with the actual video content to check whether they were misleading or over-promising. The analysis combines Kress and Van Leeuwen’s multimodal analysis framework with Hyland’s metadiscourse framework. The authors coded visual processes, composition, viewer interaction, and linguistic strategies in thumbnail text. This study is not highly computational, but it is useful for understanding the difference between qualitative multimodal analysis and computational aesthetics.
  • Theoretical Framework: The study mainly uses Kress and Van Leeuwen’s visual grammar to analyze representational meaning, interactive meaning, and compositional meaning in thumbnails. For the textual part, it uses Hyland’s metadiscourse theory to examine how self-mentions, attitude markers, engagement markers, forward references, and connectors guide users to click. This framework is closely related to Reading Images: The Grammar of Visual Design.
  • Key Findings: Clickbait thumbnails often use negative actions, shocked facial expressions, close social distance, direct gaze, exaggerated symbols, repeated exclamation marks, repeated ellipses, emojis, and forward references to create suspense. Clickbait thumbnails are not only textual clickbait. They are multimodal persuasion devices built through the cooperation of images, words, composition, and interactive cues.

"8 Amazing Secrets for Getting More Clicks": Detecting Clickbaits in News Streams Using Article Informality

Biyani, P., Tsioutsiouliklis, K., & Blackmer, J. (2016, February). ‘8 Amazing Secrets for Getting More Clicks’: Detecting Clickbaits in News Streams Using Article Informality. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 30, No. 1).
  • Research Question: This study aims to develop a machine learning model that can automatically identify clickbait articles in online news streams. The authors seek to understand which textual and structural features distinguish clickbait from regular news content.
  • Conceptualization of Clickbait: The authors argue that clickbait is not the same as spam, fake websites, or fake news. Instead, clickbait refers to content with highly attractive, exaggerated, or misleading headlines that encourage users to click, while the article itself often provides limited information or fails to deliver what the headline promises. Because many news recommendation systems rely on click-through rates, clickbait can gain disproportionate visibility and reduce user experience.
  • Types of Clickbait: The paper identifies eight categories of clickbait: exaggeration, teasing, inflammatory content, formatting-based clickbait, curiosity-driven content, bait-and-switch, ambiguous content, and factually incorrect content. Different types rely on different strategies. For example, some create an information gap through phrases such as ‘You won’t believe…’ or ‘What happened next…’, while others use excessive punctuation, capitalization, or vague promises to attract clicks.
  • Method: The authors collected Yahoo News data consisting of 1,349 clickbait articles and 2,724 non-clickbait articles. They trained a Gradient Boosted Decision Trees model using several groups of features. These included content features (headline length, exclamation marks, question marks, capitalized words, numbers, sentiment words, and clickbait phrases), headline-body similarity features, language informality features (readability, formality scores, slang, profanity, and repeated characters), forward-reference features (e.g., this, that, he, she), and URL characteristics.
  • Model Performance: The model achieved a weighted F1 score of 0.749 on the test set, suggesting that textual and structural features can effectively distinguish clickbait from regular news. One of the most important findings is that language informality is a strong predictor of clickbait. Features such as formality scores, readability levels, slang usage, headline length, capitalization, question marks, and exclamation marks all contribute substantially to prediction performance. Headline-body similarity is also useful, although it is less effective when used alone.
  • Key Findings: Different types of clickbait vary in detection difficulty. Exaggeration-based and formatting-based clickbait are easier to detect because they contain obvious linguistic and stylistic cues. In contrast, curiosity-driven, bait-and-switch, and factually incorrect clickbait are more difficult to identify because they often depend on images, videos, or factual verification rather than text alone.
  • Summary: This paper is one of the earliest studies to move clickbait research from conceptual discussion to automated detection. It provides a practical typology and feature framework that has influenced later research. A key implication is that clickbait should not be understood only through headlines themselves, but also through the relationship between headlines and content, as well as the use of information gaps, informal language, exaggerated formatting, and forward references. However, the study focuses mainly on English news articles and text-based features. It cannot fully capture visual or multimodal clickbait, making it less suitable for platforms such as YouTube or short-video services where thumbnails and visual cues play a central role.
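
A few of the headline-level content features listed in the Method note can be sketched as below. This is only an illustration of what such features look like: the `headline_features` helper and the example headline are hypothetical, the forward-reference list contains just the four words mentioned in these notes, and the paper's full feature set and Gradient Boosted Decision Trees classifier are not reproduced.

```python
import re

# The four forward-reference words mentioned in the notes; the paper's list may be longer.
FORWARD_REFERENCES = {"this", "that", "he", "she"}


def headline_features(headline):
    """Extract a handful of surface features from a headline string."""
    words = re.findall(r"[A-Za-z']+", headline)
    return {
        "length_words": len(words),
        "exclamation_marks": headline.count("!"),
        "question_marks": headline.count("?"),
        # Words written entirely in capitals (single letters are skipped).
        "all_caps_words": sum(1 for w in words if len(w) > 1 and w.isupper()),
        "has_number": bool(re.search(r"\d", headline)),
        "forward_references": sum(1 for w in words if w.lower() in FORWARD_REFERENCES),
    }


features = headline_features("You WON'T Believe What This Dog Did Next!!")
```

Feature dictionaries of this kind would be turned into numeric vectors and passed to a classifier; that step is left out here.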

Clicks for Money: Predicting Video Views Through a Sentiment Analysis of Titles and Thumbnails

Cui, G., Chung, Y., Peng, L., & Wang, Q. (2024). Clicks for money: Predicting video views through a sentiment analysis of titles and thumbnails. Journal of Business Research, 183, 114849.
  • Research Question: Many content creators use emotional and attention-grabbing thumbnails and titles to attract clicks. However, it remains unclear whether these emotional cues increase video views or whether they are perceived as clickbait and discourage users. This study examines how emotions in thumbnails and titles influence video popularity on YouTube.
  • Method: The authors collected 16,215 YouTube video thumbnails and recorded their view counts one week after publication. They combined OCR, YOLOv3, EmoNet, VADER, CLIP, and negative binomial regression models to extract and test textual features, visual emotions, and image-text congruence as predictors of video views.
  • Theoretical Framework: The study is based on schema theory, image schema theory, the two-stage visual processing framework, and curiosity gap theory. The authors argue that users first process salient visual cues and then interpret emotional meanings and image-text consistency. Based on these theories, they propose that emotional valence influences video views, emotional intensity has a curvilinear (inverted U-shaped) relationship with views, and higher image-text congruence increases popularity.
  • Key Findings: Strong emotions expressed in thumbnails increase video views. Both positive and negative facial expressions can attract user attention. In contrast, highly emotional text, question-style titles, and overly clickbait-like wording tend to reduce video views. In addition, videos with higher image-text congruence receive more views than those with mismatched thumbnails and titles.
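
The hypothesised inverted U-shaped relationship between emotional intensity and views can be made concrete with a quadratic term in a log-link count model, which is the general form a negative binomial regression takes. The coefficients below are invented for illustration and are not estimates from the paper.

```python
import math


def expected_views(intensity, b0, b1, b2):
    """Expected views under a log-link count model with a quadratic intensity term."""
    return math.exp(b0 + b1 * intensity + b2 * intensity ** 2)


def turning_point(b1, b2):
    """Intensity at which predicted views peak; requires b2 < 0 (inverted U)."""
    if b2 >= 0:
        raise ValueError("an inverted U needs a negative quadratic coefficient")
    return -b1 / (2 * b2)


# Hypothetical coefficients for illustration; not estimates from the paper.
b0, b1, b2 = 8.0, 1.2, -1.0
peak = turning_point(b1, b2)
```

With a positive linear coefficient and a negative quadratic coefficient, predicted views rise with intensity up to the turning point and fall beyond it.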

From Metrics to Insights: Computational Analysis of Visual Data in the Age of AI

Shen, C. (2025). From Metrics to Insights: Computational Analysis of Visual Data in the Age of AI. Visual Communication Quarterly, 32(1), 83–84.
  • A brief introduction that discusses some of the challenges in visual communication, particularly those related to computation and quantification, and how to address them.
  • First, we need to find meaningful benchmarks against which to compare and contrast these visual metrics, that is, construct a baseline for comparison.
  • Second, we need to condense and combine low-level visual metrics into meaningful latent clusters, reducing them to a suitable number of dimensions and encoding them appropriately for regression.
  • I have found it quite challenging to link existing quantitative metrics with traditional, purely theoretical approaches. How can we use these metrics to inform and advance theoretical frameworks? Which visual metrics and features extracted from images and videos can help us understand specific aspects of visual perception and narrative?
  • This is very thought-provoking. With so many visual variables, it’s difficult to approach topic selection from a variable-based perspective; instead, we need to start from a theoretical foundation and consider which metrics can be effectively utilized.
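
One simple way to read the benchmarking point is to express each raw metric relative to a reference sample. The sketch below standardises a metric as a z-score against a baseline set; the brightness values and the `standardize_against_baseline` helper are hypothetical, and this is only one possible interpretation of "baseline", not a method proposed in the article.

```python
from statistics import mean, stdev


def standardize_against_baseline(value, baseline_values):
    """Express a raw visual metric as a z-score relative to a reference sample."""
    mu = mean(baseline_values)
    sigma = stdev(baseline_values)  # sample standard deviation
    if sigma == 0:
        raise ValueError("baseline has no variation")
    return (value - mu) / sigma


# Hypothetical brightness scores (0-1) for a reference set of images.
baseline_brightness = [0.40, 0.50, 0.60, 0.50, 0.50]
z = standardize_against_baseline(0.70, baseline_brightness)
```

A standardised score says how unusual an image is relative to the chosen benchmark, which is easier to interpret than the raw metric alone.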

2026-04

How Visual Aesthetics and Calorie Density Predict Food Image Popularity on Instagram: A Computer Vision Analysis

Sharma, M., & Peng, Y. (2024). How visual aesthetics and calorie density predict food image popularity on Instagram: A computer vision analysis. Health Communication, 39(3), 577–591.
  • Research Question: This study examines why some food photos on Instagram receive more user engagement than others. The main question is whether visual aesthetic features and calorie density affect the popularity of food images. The authors also explore whether low-calorie foods can gain more attention through better visual design.
  • Method: The authors collected 53,894 images posted by 90 popular food-related Instagram accounts over two years. After data cleaning, 43,978 food images were retained. Computer vision techniques were used to measure visual features such as color, brightness, color richness, feature complexity, compositional complexity, color diversity, and repetition. Clarifai and Nutritionix were used to estimate calorie density. Multilevel regression models were applied to predict likes and comments. In addition, a crowdsourcing survey was conducted to validate whether the computer-generated measures matched human perceptions.
  • Theoretical Framework: The study draws on theories of visual aesthetics, emotional arousal, and food perception. Warm colors such as red, orange, and yellow are expected to increase arousal and make images more attractive. Visual complexity may also attract attention and increase engagement. From a health communication perspective, the authors argue that high-calorie foods are naturally appealing, while low-calorie foods may depend more on visual aesthetics to gain attention.
  • Key Findings: Red, orange, and yellow colors, feature complexity, and repetition significantly increased likes and comments. In contrast, brightness, color richness, and compositional complexity were negatively associated with engagement. Images of higher-calorie foods generally received more engagement, although this effect was not unlimited. Extremely high-calorie foods were not always more popular than moderately high-calorie foods. Most importantly, visual aesthetics had a stronger effect on low-calorie food images, suggesting that effective visual design can improve the appeal of healthy foods on social media.
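
To give a sense of what low-level color measures look like, the sketch below computes mean brightness and the share of warm-hued pixels from a list of RGB values. The hue and saturation thresholds and the `color_features` helper are assumptions made for illustration; the paper's exact operationalization of these features may differ.

```python
import colorsys


def color_features(pixels):
    """Mean brightness and warm-pixel share for a list of (R, G, B) tuples in 0-255."""
    if not pixels:
        raise ValueError("need at least one pixel")
    brightness_total = 0.0
    warm = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        brightness_total += v
        # Assumed warm range: hue from 0 to 60 degrees (red through yellow).
        # Near-grey pixels are skipped because their hue is not meaningful;
        # reds with a hue just below 360 degrees are ignored for brevity.
        if s > 0.2 and h * 360 <= 60:
            warm += 1
    n = len(pixels)
    return {"brightness": brightness_total / n, "warm_share": warm / n}


# Hypothetical four-pixel "image": red, orange, blue, grey.
pixels = [(255, 0, 0), (255, 165, 0), (0, 0, 255), (128, 128, 128)]
features = color_features(pixels)
```

For a real photo the pixel list would come from an image library, and the resulting features would enter the regression alongside the other predictors.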