Year of publication
- 2025 (1)
Document Type
- Master's Thesis (1)
Language
- English (1)
Has Fulltext
- yes (1)
Is part of the Bibliography
- yes (1)
Keywords
- Persona (1)
Understanding user behaviors and preferences is crucial in today's digital landscape, driving the need for automated persona generation. This thesis explores the potential of topic modeling and sentiment analysis to enhance data-driven persona creation. Analyzing a corpus of 676,000 tweets from 6,760 users of Twitter (now x.com), the study applies BERTopic for topic modeling and VADER for sentiment analysis to identify distinct themes and emotional tendencies in user-generated content. A key finding is the significant impact of pre-processing, which improves topic coherence and interpretability, contradicting claims that BERTopic performs equally well on raw data. The results indicate that bots predominantly generate neutral, task-oriented content, while human users, particularly female users, express more varied and emotionally rich sentiment. Integrating topic modeling and sentiment analysis enables multidimensional persona creation by combining thematic interests with emotional characteristics, emphasizing the value of author profiling in data-driven persona generation. This thesis highlights the potential of text mining techniques in persona creation while acknowledging challenges such as sentiment misclassification and the differentiation between bots and humans. Moreover, the findings highlight the need for structured datasets to enhance large language model-based persona descriptions, ensuring greater accuracy and coherence. Future research should explore alternative machine learning models, refine clustering methods, and assess cross-platform applicability. The combination of topic modeling and sentiment analysis offers promising opportunities for automating persona generation, for example by enhancing targeted marketing and improving social media analysis.
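
The idea of combining thematic interests with emotional characteristics can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes each tweet has already been given a topic label (for example by a topic model) and a compound sentiment score in [-1, 1] (for example from VADER). The record layout, the function names, and the sample data are hypothetical. The ±0.05 cut-offs follow the thresholds conventionally used with VADER's compound score.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical sample data: (user_id, topic_label, compound_score) per tweet.
# Topic labels stand in for topic-model output; scores stand in for a
# VADER-style compound score in [-1, 1].
SAMPLE_RECORDS = [
    ("u1", "sports", 0.6),
    ("u1", "sports", 0.4),
    ("u1", "politics", -0.1),
    ("u2", "weather", 0.0),
    ("u2", "weather", 0.02),
]


def sentiment_label(compound, threshold=0.05):
    """Map a compound score to a coarse sentiment class."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def build_personas(records, top_n=2):
    """Aggregate per-tweet topics and scores into one profile per user."""
    by_user = defaultdict(list)
    for user_id, topic, compound in records:
        by_user[user_id].append((topic, compound))

    personas = {}
    for user_id, rows in by_user.items():
        topic_counts = Counter(topic for topic, _ in rows)
        mean_compound = mean(score for _, score in rows)
        personas[user_id] = {
            # Most frequent topics approximate the user's thematic interests.
            "top_topics": [t for t, _ in topic_counts.most_common(top_n)],
            # Mean score and its class approximate the emotional tendency.
            "mean_compound": round(mean_compound, 3),
            "dominant_sentiment": sentiment_label(mean_compound),
        }
    return personas


if __name__ == "__main__":
    for user, profile in build_personas(SAMPLE_RECORDS).items():
        print(user, profile)
```

Averaging the compound score per user is only one possible aggregation. A distribution over sentiment classes per topic would keep more of the variation in sentiment that the abstract describes.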
