07/06/2020

Towards Using Word Embedding Vector Space for Better Cohort Analysis

Mohamed Bahgat, Steve Wilson, Walid Magdy

Keywords: clusters, communities, discussions, embeddings, groups, health, mental health, reddit, spaces, tools, word embeddings, words

Abstract: Social media platforms give users a place to express their opinions, interact with others and reflect on their personal experiences. On websites like Reddit, users join communities where they discuss specific topics, which clusters them into possible cohorts. These cohorts provide an opportunity to analyse individuals with specific tendencies. Authors within these cohorts can post more openly under the blanket of anonymity, and such openness provides a more accurate signal of the real issues individuals are facing. Some communities within Reddit contain discussions about mental health struggles such as depression and suicidal ideation. To better understand and analyse these individuals, we propose to exploit a property of word embeddings: related concepts are grouped close to each other in the embedding space. For the posts from each topically situated sub-community, we build a word embedding model and use handcrafted lexicons to identify emotions, values and psycholinguistically relevant concepts. We then extract insights into the way users perceive these concepts by measuring the distance between the concepts and the references users make to themselves, to others or to things around them. We put our tool to the test to see whether it can extract meaningful signals.
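
The distance measurement described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional vectors, the word lists and the helper names (`cosine_similarity`, `mean_similarity`) are all assumptions made up for illustration, and cosine similarity is assumed as the closeness measure. In practice the vectors would come from an embedding model trained on each community's posts.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_similarity(embeddings, reference_words, lexicon_words):
    """Average similarity between reference words (e.g. pronouns) and
    lexicon words, skipping any word missing from the vocabulary."""
    refs = [w for w in reference_words if w in embeddings]
    terms = [w for w in lexicon_words if w in embeddings]
    if not refs or not terms:
        return None  # nothing to compare
    total = sum(
        cosine_similarity(embeddings[r], embeddings[t])
        for r in refs
        for t in terms
    )
    return total / (len(refs) * len(terms))

# Hypothetical toy vectors standing in for one community's embedding model.
toy_embeddings = {
    "i": [0.9, 0.1, 0.0],
    "me": [0.8, 0.2, 0.1],
    "they": [0.1, 0.9, 0.1],
    "sad": [0.85, 0.15, 0.05],
    "hopeless": [0.9, 0.05, 0.1],
}

self_refs = ["i", "me"]        # references to oneself
other_refs = ["they"]          # references to others
sadness_lexicon = ["sad", "hopeless", "gloomy"]  # "gloomy" is out of vocabulary

self_score = mean_similarity(toy_embeddings, self_refs, sadness_lexicon)
other_score = mean_similarity(toy_embeddings, other_refs, sadness_lexicon)
print("self vs sadness:", self_score)
print("others vs sadness:", other_score)
```

In this toy setup the self-references sit closer to the sadness lexicon than the references to others, which is the kind of contrast the approach looks for when comparing cohorts.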

The talk and the respective paper are published at the ICWSM 2020 virtual conference.


Similar Papers