Nouns

molliem opened this issue 6 years ago

Letters for women are more likely to use adjectives instead of nouns

Mollie Marr · Answer 1 · Fri May 04 2018 02:34:52 GMT+0800 (China Standard Time)

Goal: Develop code that can read text for the presence of nouns that highlight roles/positions (like leader, researcher). If position nouns are absent, return a summary statement that directs the author to consider using nouns to strengthen the letter.

This one can be complicated. The goal is to distinguish position nouns from descriptions that use adjectives or verbs instead, or that weaken the position noun (e.g., "she was involved in research," "she taught").
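A minimal sketch of the goal stated above, assuming a hand-built word list rather than POS tagging. The `ROLE_NOUNS` set and the function names `find_role_nouns` and `noun_feedback` are hypothetical and only for illustration; the project's real list and wording may differ.

```python
import re

# Hypothetical starter list of role/position nouns (an assumption, not the project's list).
ROLE_NOUNS = {"leader", "researcher", "scientist", "teacher", "mentor", "expert"}


def find_role_nouns(text):
    """Return the role nouns found in `text`, in order of appearance."""
    found = []
    for word in re.findall(r"[a-z]+", text.lower()):
        # Crude plural handling: "leaders" counts as "leader".
        if word.endswith("s") and word[:-1] in ROLE_NOUNS:
            word = word[:-1]
        if word in ROLE_NOUNS:
            found.append(word)
    return found


def noun_feedback(text):
    """Return a short summary statement for the letter's author."""
    found = find_role_nouns(text)
    if found:
        return "Role nouns found: " + ", ".join(sorted(set(found)))
    return ("No role or position nouns were found. Consider using nouns such as "
            "'leader' or 'researcher' to strengthen the letter.")


if __name__ == "__main__":
    print(noun_feedback("She was involved in research and she taught."))
    print(noun_feedback("She is a leader and an accomplished researcher."))
```

A plain word list like this cannot tell whether a word is actually used as a noun, so it is only a first pass.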

Vassiki Chauhan · Answer 2 · Sun May 06 2018 06:50:37 GMT+0800 (China Standard Time)

This project sounds amazing; congrats on pitching such a cool project. I'm going to start putting together a script to track the frequency of nouns and adjectives in different letters, and potentially play with sentiment analysis to figure out whether letters for men show a stronger positive sentiment, which would indicate heavier use of superlatives. I do most of my coding in Python. Is that going to be a problem?

Catherine DeJager · Answer 3 · Wed May 09 2018 09:26:00 GMT+0800 (China Standard Time)

I am also interested in working on this problem. One way to approach this is to make a list of relevant nouns and their corresponding verbs and check the relative frequencies of these (e.g., if "leads" or "led" is used more than "leader"). That seems like a fairly simple first step and I could work on that. It's probably also a good idea to use POS tagging to detect passive voice, as that would catch things like "was involved".

What programming language are we using? I'm most comfortable with Python, though I've done some coding in Perl (I know other languages, but I don't think any of them would be good for this sort of problem). What sort of POS tagging would we use? I've used TreeTagger for Python and Lingua::EN::Tagger for Perl, but I know Python's nltk has several POS taggers built in. I've also used spaCy a little bit, but I'm less familiar with that.
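The noun-versus-verb frequency comparison suggested above could be sketched roughly as follows. This uses only the standard library and a hypothetical `NOUN_TO_VERBS` mapping; it does no POS tagging, so ambiguous forms (for example "research" or "lead", which can each be a noun or a verb) are either left out or counted naively.

```python
import re
from collections import Counter

# Hypothetical noun -> related verb forms; the real list would need curation.
# "research" is deliberately omitted because it is ambiguous without POS tagging.
NOUN_TO_VERBS = {
    "leader": {"lead", "leads", "led", "leading"},
    "researcher": {"researched", "researches", "researching"},
    "teacher": {"teach", "teaches", "taught", "teaching"},
}


def noun_verb_counts(text):
    """Map each role noun to a (noun_count, verb_count) pair for `text`."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    result = {}
    for noun, verbs in NOUN_TO_VERBS.items():
        noun_count = counts[noun] + counts[noun + "s"]  # singular + plural
        verb_count = sum(counts[verb] for verb in verbs)
        result[noun] = (noun_count, verb_count)
    return result


def verbs_outnumber_nouns(text):
    """Return the role nouns whose verb forms appear more often than the noun."""
    return sorted(
        noun for noun, (n, v) in noun_verb_counts(text).items() if v > n
    )


if __name__ == "__main__":
    letter = "She led the team and taught two courses. She is a researcher."
    print(noun_verb_counts(letter))
    print(verbs_outnumber_nouns(letter))
```

A POS tagger would be the natural next step for resolving the ambiguous forms and for catching passive constructions such as "was involved".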

Mollie Marr · Answer 4 · Wed May 09 2018 12:25:36 GMT+0800 (China Standard Time)

Python is perfect! That is the language I know best! I'm still learning programming, so that isn't saying much! I am working on setting up a website (www.biascorrect.com). Hoping to have that ready to go by Thursday.

Feel free to use the POS tagging you are most comfortable with! Please remember to add your names to the contributors page as well. I want to be certain to recognize all the contributions.

Mollie Marr · Answer 5 · Wed May 09 2018 12:32:44 GMT+0800 (China Standard Time)

Thank you both for the kind words, support, and help!

Jordan Matelsky · Answer 6 · Thu May 17 2018 06:06:31 GMT+0800 (China Standard Time)

@molliem — love love love this project, and looking forward to helping!

I did a quick search of the repo and it doesn't look like anyone's mentioned proselint here. This is a general prose-checking framework (with tips like weasel_words.very: "don't use the word 'very'", or typography.symbols.curly_quotes: "Use curly quotes “”, not straight quotes """), and the needs of this project reminded me of proselint's plugin-based architecture.

In short, each 'plugin' has its own rules, ways of checking, and error messages, and each is completely independent of the others. So an adjectives_vs_nouns plugin can use a totally different technology to check for bias than a stereotypes plugin.

Thought I'd drop the link here in case it's a useful reference, but in the meantime, looking forward to getting started wherever is most helpful!
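To make the plugin idea concrete, here is a rough sketch of independent checks behind one runner. This is not proselint's actual API; the `CHECKS` registry, the check functions, and the word lists are all hypothetical, and only the plugin names `adjectives_vs_nouns` and `stereotypes` come from the description above.

```python
import re


def check_adjectives_vs_nouns(text):
    """Flag phrasing that weakens a role noun (hypothetical single rule)."""
    message = "Consider a role noun (e.g. 'researcher') instead of 'was involved in'."
    return [(m.start(), message)
            for m in re.finditer(r"\bwas involved in\b", text, re.IGNORECASE)]


def check_stereotypes(text):
    """Flag words from a hypothetical list of stereotyped descriptors."""
    message = "This descriptor may reflect a stereotype."
    return [(m.start(), message)
            for m in re.finditer(r"\b(caring|nurturing)\b", text, re.IGNORECASE)]


# Each check is independent: it takes text and returns (position, message) pairs.
CHECKS = {
    "adjectives_vs_nouns": check_adjectives_vs_nouns,
    "stereotypes": check_stereotypes,
}


def run_checks(text, checks=CHECKS):
    """Run every registered check and return (position, name, message), sorted by position."""
    results = []
    for name, check in checks.items():
        for position, message in check(text):
            results.append((position, name, message))
    return sorted(results)


if __name__ == "__main__":
    for position, name, message in run_checks(
        "She was involved in research and is caring."
    ):
        print(position, name, message)
```

Because each check only has to return positions and messages, one could use a word list while another uses a POS tagger, without the runner changing.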

Mollie Marr · Answer 7 · Fri May 18 2018 02:42:51 GMT+0800 (China Standard Time)

@j6k4m8 Thank you for the kind words! And thanks for the link to proselint!! I hadn't heard of it and it is a fabulous reference!

Although we haven't been tackling this project as plugins, our approach feels similar. I divided the issues up into topics and people have been working on scripts for each topic. My plan at the end is to use a wrapper or a for loop to combine the separate topics into one.

Do you know Python? There are four issues that no one has tackled: superlatives, family life, minimal assurance, and raises doubt. Help on any of those would be great. If you know web design, I could use some help there too. The site is pretty plain right now.

Thanks for reaching out! Excited to have you join the team!

Jordan Matelsky · Answer 8 · Fri May 18 2018 02:44:46 GMT+0800 (China Standard Time)

Python or web-design or both! Up to you, wherever you'd prefer to have more help!

Mollie Marr · Answer 9 · Fri May 18 2018 02:49:34 GMT+0800 (China Standard Time)

Amazing! It would be great if you could work on family life, minimal assurance or raises doubt (any of them). My goal was to identify the presence of words and phrases associated with these areas and give feedback, but also to highlight the words in the text box. If you need help with word lists, I can probably tackle that this weekend.
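One possible shape for the "find the phrases, give feedback, and highlight them" step, as a sketch only. The `DOUBT_PHRASES` list is a hypothetical placeholder for a real word list, and wrapping matches in `<mark>` tags is an assumption about how the text box might display highlights.

```python
import re

# Hypothetical placeholder phrases for the "raises doubt" topic.
DOUBT_PHRASES = ["might", "it appears", "to some extent"]


def highlight(text, phrases, open_tag="<mark>", close_tag="</mark>"):
    """Return (marked_text, found), wrapping each matched phrase in tags.

    Matching is case-insensitive and limited to whole words. The text is
    not HTML-escaped here; that would need to happen before display.
    """
    if not phrases:
        return text, []
    # Longest phrases first, so a longer phrase wins over a shorter one it contains.
    ordered = sorted(phrases, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in ordered) + r")\b",
        re.IGNORECASE,
    )
    found = [m.group(0) for m in pattern.finditer(text)]
    marked = pattern.sub(lambda m: open_tag + m.group(0) + close_tag, text)
    return marked, found


if __name__ == "__main__":
    marked, found = highlight("It appears she might succeed.", DOUBT_PHRASES)
    print(marked)
    print("Phrases to reconsider:", found)
```

The same function could be reused for family life and minimal assurance by passing a different phrase list.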