2526-2-FDS01Q013: Available Master’s Thesis Topics in NLP and Social Media

Dear students,

The following two Master's thesis topics are available for students interested in NLP and social media. Interested students are invited to contact me for further information.

Best regards,

Marco Viviani

_____________________________________________________________________________________________

Data Science Methods for Understanding Vaccine Hesitancy in Social Media

This thesis aims to investigate vaccine hesitancy through the analysis of social media data using data science and Natural Language Processing techniques. Starting from a review of the state of the art on vaccine hesitancy, social media analysis, and behavior change theories, the work will focus on collecting new data, or reusing existing datasets, related to online vaccine discussions.

The main objective is to develop a computational pipeline to identify and characterize hesitant attitudes, recurring topics, sentiment, and behavioral dimensions emerging from social media conversations. Machine learning and NLP methods, including text classification, topic modeling, sentiment analysis, and possibly transformer-based models, will be used to detect patterns associated with trust, perceived risk, misinformation, social norms, and intention to vaccinate.

The expected outcome is a data-driven characterization of vaccine hesitancy in online discussions, together with an evaluation of computational methods for its automatic detection and interpretation.

_____________________________________________________________________________________________

Detecting Cyberbullying in Online Platforms: Signals, Methods, and Moderation Strategies

This thesis aims to investigate how cyberbullying can be detected on online platforms by combining a study of the scientific literature with a data science perspective. The work will review existing approaches for identifying harmful interactions, focusing on NLP-based signals such as abusive language, toxicity, sentiment, intent, hate speech, threats, sarcasm, and conversational context, as well as non-textual signals such as user behavior, interaction patterns, network structure, reporting mechanisms, and temporal dynamics.

The thesis will also analyze computational methods for cyberbullying detection, including traditional machine learning, deep learning, transformer-based models, multimodal approaches, and graph-based techniques. A further objective is to study possible countermeasures and moderation mechanisms adopted by online platforms, such as automatic content filtering, user reporting, warning systems, human-in-the-loop moderation, enforcement of community guidelines, and educational or preventive interventions.

The expected outcome is a structured overview of the main signals, detection techniques, and moderation strategies for cyberbullying in online environments, highlighting current limitations and opportunities for future data-driven solutions.