Wordcloud of user's comments and posts

Question

kaustubhhiware opened this issue 6 years ago

This task could be a bit harder, since you have to process both posts and comments, and then aggregate words from both, to form a wordcloud.

What hashtags do you frequently use?
What language do you often use?
Skip proper nouns, especially your friends' names.

Parikansh Ahluwalia · Answer 1 · Tue Jun 26 2018 21:29:34 GMT+0800 (China Standard Time)

@kaustubhhiware Can I please move to this issue after finishing my current one, which is almost done (I hope so :p)? This seems really interesting. I will share the implementation details and a to-do list soon.
Thanks!

Kaustubh Hiware · Answer 2 · Wed Jun 27 2018 08:22:31 GMT+0800 (China Standard Time)

Sure!
Your PR has been approved by two mentors. Let the other mentors have a look at it, but there's mostly no work left there.
Assigning this issue to you.

Parikansh Ahluwalia · Answer 3 · Wed Jun 27 2018 20:07:58 GMT+0800 (China Standard Time)

Hi. For the wordcloud, there would be two parts.

1. Preparing the compiled text from posts and comments. This should not be too tough.
2. Preparing the wordcloud image. I looked it up, and there is a Python library with the obvious name wordcloud. I tried running it and it worked perfectly, so I will use that to prepare the image.
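The two parts above could be sketched roughly as follows. This is only an illustration under assumptions: the sample `posts` and `comments`, the tiny `STOPWORDS` set, and the helper names `word_frequencies` and `save_wordcloud` are all hypothetical, and the real input would come from the user's own data. The rendering step uses the `wordcloud` package mentioned above, imported lazily so the counting part runs without it.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); a real run would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "is", "in", "it", "this"}


def word_frequencies(posts, comments):
    """Count lowercase words across posts and comments, skipping stop words."""
    counts = Counter()
    for text in list(posts) + list(comments):
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOPWORDS and len(word) > 1:
                counts[word] += 1
    return counts


def save_wordcloud(frequencies, path="wordcloud.png"):
    """Render the counts as an image with the third-party `wordcloud` package."""
    from wordcloud import WordCloud  # lazy import; needs `pip install wordcloud`

    cloud = WordCloud(width=800, height=400, background_color="white")
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(path)


# Hypothetical sample data standing in for the real posts and comments.
posts = ["Loved the hackathon this weekend", "Hackathon photos are up"]
comments = ["Great photos", "The hackathon was fun"]
freqs = word_frequencies(posts, comments)
```

Calling `save_wordcloud(freqs)` would then write the image. Keeping the counting separate from the rendering means the same counts can be reused for other plots.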

For hashtags, I would maintain a count and plot the top 5 or top 10.
I don't currently have an idea about language detection. I will have to look something up for that later.
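The hashtag count could look something like the sketch below. It assumes hashtags are a `#` followed by word characters and treats them case-insensitively. The helper name `top_hashtags` and the sample texts are made up for illustration.

```python
import re
from collections import Counter


def top_hashtags(texts, n=10):
    """Return the n most frequent hashtags as (tag, count) pairs."""
    counts = Counter()
    for text in texts:
        # Lowercase so that #Travel and #travel are counted together.
        counts.update(tag.lower() for tag in re.findall(r"#(\w+)", text))
    return counts.most_common(n)


# Hypothetical sample data.
texts = [
    "Sunset at the beach #Travel #photography",
    "Back home again #travel",
    "New lens day #gear",
]
top = top_hashtags(texts, n=2)
```

The resulting pairs can be passed straight to a bar chart, or reused as frequencies for a hashtag-only wordcloud.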

Would this be fine? Thanks!

Anubhav Singh · Answer 4 · Wed Jun 27 2018 22:03:16 GMT+0800 (China Standard Time)

Hey @parimatrix you might want to take a look at TextBlob. It will help you with the language detection and removal of proper nouns! Best of luck.
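For the proper-noun part, one possible approach is to use TextBlob's part-of-speech tags and drop anything tagged `NNP` or `NNPS`. The sketch below covers only that filtering step. The helper names are hypothetical, the tagged sample is hand-written rather than real TextBlob output, and the TextBlob call is imported lazily because it needs the package and its corpora installed.

```python
PROPER_NOUN_TAGS = {"NNP", "NNPS"}  # Penn Treebank tags for proper nouns


def drop_proper_nouns(tagged_words):
    """Keep only the words whose part-of-speech tag is not a proper noun."""
    return [word for word, tag in tagged_words if tag not in PROPER_NOUN_TAGS]


def tag_with_textblob(text):
    """Return (word, tag) pairs using TextBlob's part-of-speech tagger."""
    from textblob import TextBlob  # lazy import; needs `pip install textblob`

    return TextBlob(text).tags


# Hand-written tagged sample (an assumption, not actual tagger output).
sample = [("Rahul", "NNP"), ("loves", "VBZ"), ("spicy", "JJ"), ("food", "NN")]
kept = drop_proper_nouns(sample)
```

With TextBlob installed, `drop_proper_nouns(tag_with_textblob(text))` would chain the two steps. Taggers can mislabel lowercase names in casual text, so some names may still slip through.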

Kaustubh Hiware · Answer 5 · Thu Jun 28 2018 12:28:47 GMT+0800 (China Standard Time)

Here's an idea (using this is up to you):
We often tag our friends in memes, right?
Do you think it would be interesting to see which friends you have tagged the most frequently?
Once you have extracted all the text from posts and comments, get a list of names (should be easy), and only retain words in the list. Plot a wordcloud.

Up to you.
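A rough sketch of the friend-tag idea, assuming the names are available as a plain list of single-word first names (multi-word names would need different matching). The function name `friend_mentions` and the sample data are hypothetical.

```python
import re
from collections import Counter


def friend_mentions(texts, friend_names):
    """Count only the words that appear in the list of friend names."""
    names = {name.lower() for name in friend_names}
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[A-Za-z]+", text):
            if word.lower() in names:
                counts[word.lower()] += 1
    return counts


# Hypothetical sample data.
texts = ["Asha look at this meme", "lol Asha and Ravi", "Ravi Ravi Ravi"]
mentions = friend_mentions(texts, ["Asha", "Ravi", "Meera"])
```

The resulting counts have the same shape as ordinary word frequencies, so they can be plotted as a wordcloud the same way.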