Researchers from HSE University Perm present an analysis of the sociology of consumption using the BERT model
On June 21, Junior Research Fellow of the Laboratory Yevgeniya Shenkman and Associate Professor of the Department of Management, Candidate of Sociological Sciences Yuliya Papushina presented their research at the scientific seminar of the Department of Management at HSE University Perm. The research was titled "Topics and Dynamics in the Sociology of Consumption from 1976 to 2023: Automated Literature Analysis Using the BERT Deep Learning Model."
The research, whose preliminary results were presented at the seminar, is an automated analysis of publications on the sociology of consumption from 1976 to 2023. The study is relevant because automated text analysis is widely used to trace the evolution of various disciplines, yet comparable research on the sociology of consumption is still lacking. Machine learning methods help reduce the subjectivity of traditional literature reviews, which rely on researchers reading the articles themselves, and allow researchers to cover a much larger volume of material and identify non-obvious thematic directions.
The sociology of consumption is a vast and complex area of sociological knowledge, encompassing thousands of studies conducted using various methodologies. Therefore, a comprehensive literature review is impossible without the use of modern machine learning tools. This study is the first of its kind to thematically map the field of sociology of consumption using the BERTopic model. It also demonstrates what different text representations, including individual words, bigrams, and author keywords, can contribute to automated literature analysis.
Unlike most studies, which use the LDA model for automated literature analysis, this research employs the BERTopic approach (Grootendorst, 2022) to analyze the texts and identify key topics. BERTopic works in three sequential steps: a pre-trained language model, such as a Sentence Transformer, creates vector representations of the texts; these vectors are clustered; and the resulting topic clusters are described using a class-based variant of TF-IDF. In her part of the presentation, Yevgeniya Shenkman explained to the audience the principles, advantages, and limitations of the BERTopic approach.
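The last step of this pipeline, describing each cluster by its most characteristic terms, can be illustrated with a minimal sketch. It assumes that the embedding and clustering steps have already produced a cluster label for every document, and it uses the class-based TF-IDF weighting from Grootendorst (2022), W = tf * log(1 + A / f), where tf is a term's frequency within a cluster, f is its frequency across all clusters, and A is the average number of words per cluster. The toy documents, the labels, and the function names `c_tf_idf` and `top_terms` are hypothetical and are not taken from the study.

```python
import math
from collections import Counter, defaultdict


def c_tf_idf(docs, labels):
    """Score terms per cluster with class-based TF-IDF.

    docs   : list of token lists (assumed already tokenized)
    labels : cluster id for each document (assumed output of the clustering step)
    """
    # Merge all documents of one cluster into a single bag of words.
    class_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        class_counts[label].update(tokens)

    # f: frequency of each term across all clusters.
    total_counts = Counter()
    for counts in class_counts.values():
        total_counts.update(counts)

    # A: average number of words per cluster.
    avg_words = sum(total_counts.values()) / len(class_counts)

    # W = tf * log(1 + A / f) for every term in every cluster.
    return {
        label: {
            term: tf * math.log(1 + avg_words / total_counts[term])
            for term, tf in counts.items()
        }
        for label, counts in class_counts.items()
    }


def top_terms(scores, n=3):
    """Return the n highest-scoring terms per cluster (ties broken alphabetically)."""
    return {
        label: [term for term, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:n]]
        for label, s in scores.items()
    }


# Hypothetical toy corpus: two clusters of two short "abstracts" each.
docs = [
    ["organic", "food", "consumer"],
    ["food", "waste", "household"],
    ["fashion", "brand", "identity"],
    ["fashion", "consumer", "identity"],
]
labels = [0, 0, 1, 1]

scores = c_tf_idf(docs, labels)
print(top_terms(scores, n=2))
```

In this toy example the word "consumer" appears in both clusters, so it receives a lower weight than terms that occur in only one cluster. That is the property that makes the top-ranked terms usable as topic descriptions.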
In the second part of the seminar, Yuliya Papushina reported the project's results. She began by addressing the main research questions: What are the overall dynamics of the field over the study period? How many thematic segments can be identified within it? Which topics attract increasing interest from scholars, and which are declining? It turned out that the sociology of consumption is organized around nine themes, which can be grouped around sustainability, food sociology, and consumer behavior. For each theme, the presenters examined the country affiliation of researchers publishing on the topic, the most popular journals for the theme, and the structure of the text flow. They also described the yearly publication dynamics, which show how researchers' interest in the nine themes has changed over time.
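The article does not say how the yearly dynamics were computed. One simple way to look at them, sketched here under that caveat, is to calculate each topic's share of all publications in a given year, so that overall growth in the literature does not hide changes in relative interest. The records and the function name `topic_shares_by_year` are hypothetical.

```python
from collections import Counter, defaultdict


def topic_shares_by_year(records):
    """Return {topic: {year: share}} from a list of (topic, year) pairs.

    The share is the topic's publication count in a year divided by
    the total number of publications in that year.
    """
    per_year = Counter(year for _, year in records)
    per_topic_year = Counter(records)

    shares = defaultdict(dict)
    for (topic, year), n in per_topic_year.items():
        shares[topic][year] = n / per_year[year]
    return dict(shares)


# Hypothetical records: one (topic, publication year) pair per article.
records = [
    ("food system", 2000),
    ("food system", 2000),
    ("sustainable fashion", 2000),
    ("sustainable fashion", 2001),
]

shares = topic_shares_by_year(records)
print(shares["food system"])
```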
Among the most important findings of the study is that individual words, bigrams, and author keywords give different pictures of the thematic clusters' content. Depending on the topic, the dominance of English-speaking countries is challenged by China, India, and European countries. Topics that have seen sustained interest over the past 20 years include "Green" marketing, the Food system, and Sustainable fashion. Topics with declining interest include Consumer practices and identities, Marketing science, and the Circular economy.
Following the presentation, an active discussion ensued, addressing issues such as the impact of the initial data on the findings, the interpretation of results in the context of previous bibliometric studies, and prospects for further research.