keon / awesome-nlp

:book: A curated list of resources dedicated to Natural Language Processing (NLP)


Clean Up and Update Single Exchange Dialogs

NirantK opened this issue

The Single Exchange Dialogs section is ambiguous, too broad and out of date. Here is how you can help us improve this:

  1. Remove links which you think do not fit in the section. Don't worry about damaging this repository; we can discuss it on the PR you raise.
  2. Consider adding 2-3 code examples and datasets.
  3. Consider adding 2-3 examples from slot filling (sequence mining for text is welcome too) and other approaches used in chatbots.
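For anyone unsure what "slot filling" in point 3 refers to, here is a minimal, rule-based sketch in Python. The slot names, the regex patterns and the flight-booking scenario are all hypothetical and chosen only for illustration; real systems usually learn slots with a sequence-labelling model rather than hand-written patterns.

```python
import re

# Hypothetical slot patterns for a toy flight-booking bot (illustrative only).
SLOT_PATTERNS = {
    "origin": re.compile(r"\bfrom (\w+)", re.IGNORECASE),
    "destination": re.compile(r"\bto (\w+)", re.IGNORECASE),
    "day": re.compile(
        r"\bon (monday|tuesday|wednesday|thursday|friday|saturday|sunday)\b",
        re.IGNORECASE,
    ),
}


def fill_slots(utterance):
    """Return a dict of slot name -> lowercased value for each matching pattern."""
    slots = {}
    for name, pattern in SLOT_PATTERNS.items():
        match = pattern.search(utterance)
        if match:
            slots[name] = match.group(1).lower()
    return slots


print(fill_slots("Book a flight from Delhi to Mumbai on Friday"))
# {'origin': 'delhi', 'destination': 'mumbai', 'day': 'friday'}
```

Slots that never match are simply left out of the result, which is how a dialog manager would know what to ask the user next.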

Hey @NirantK, in the code examples, can I add how one can use CountVectorizer and TfidfVectorizer? And how pre-trained vectors can be used in NLP?

Hey @anu0012, if the vectorization methods are specific to dialogs and chatbots, feel free to add them to that section. I guess they are not.

If you are asking whether you can add them in general, consider adding them to the tutorials section if they are not already covered.
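For context on the two vectorization methods discussed above, here is a from-scratch sketch of what count and TF-IDF vectorization compute. It deliberately avoids scikit-learn, so the function names (`tokenize`, `count_vectors`, `tfidf_vectors`) are hypothetical helpers, not the library's API. The smoothed IDF and L2 normalisation follow my understanding of scikit-learn's documented defaults, and the whitespace tokenizer is a simplifying assumption.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a simplifying assumption)."""
    return text.lower().split()


def count_vectors(docs):
    """Return (vocabulary, rows) where each row holds raw term counts."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    rows = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def tfidf_vectors(docs):
    """Return (vocabulary, rows) where each row is an L2-normalised TF-IDF vector."""
    vocab, counts = count_vectors(docs)
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    # Smoothed inverse document frequency.
    idf = [math.log((1 + n_docs) / (1 + df)) + 1 for df in doc_freq]
    rows = []
    for row in counts:
        weighted = [count * weight for count, weight in zip(row, idf)]
        norm = math.sqrt(sum(x * x for x in weighted)) or 1.0
        rows.append([x / norm for x in weighted])
    return vocab, rows


docs = ["how are you", "how old are you", "where are you from"]
vocab, counts = count_vectors(docs)
_, tfidf = tfidf_vectors(docs)
print(vocab)
print(counts[0])
print([round(x, 3) for x in tfidf[1]])
```

Words that appear in every utterance ("are", "you") end up with lower TF-IDF weight than rarer ones ("old"), which is the main practical difference from raw counts.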

https://github.com/anu0012/Predict_the_happiness_challenge/blob/master/notebook.ipynb

In this notebook, I have used a TF-IDF vectorizer, along with several techniques such as text cleaning, lemmatization and stemming. Can I add this?

No, @anu0012, that does not meet our requirements just yet. Please refer to the tutorials section to get an estimate of the quality needed to be included here.

I am sure you can polish it to make it awesome and help the community in the process!

@NirantK, I first checked the links here.

  1. The RNNLM toolkit link is broken.
  2. The other papers have working links.

New stuff worth adding:

  • SPMF, a Java library for pattern mining
  • A sequential pattern mining tutorial and a 'hands-on' thingy
  • This code repo, a dual LSTM encoder for dialog response generation from the Ubuntu corpus

Am I missing anything or mistaking something for something else?

I understand 😄. I mistook it for something else. No problem, I will open the needed issue and look further into DialogCI and the Ubuntu corpus.

https://www.tidytextmining.com

I think this can be added to the reading section. What do you think, @NirantK?

@anu0012 good find. Since this is an entire book and not a one-off tutorial, let's create a new Books section under tutorials and add it there.

This becomes our excuse to make some progress on #5 as well.

Thank you @the-ethan-hunt.

I have fixed the broken link and closed #105.

As a quick note, Dialogflow is a tool for building Human-Computer Interaction (HCI) systems. In layman's terms, it is a tool for making chatbots.

Hey @the-ethan-hunt, do consider continuing to contribute to awesome-nlp. Take a look at this issue if you'd like :)

Sure @NirantK! But is there any other issue I might possibly work on? 😅

Sure @the-ethan-hunt.

Thanks for adding Korean from #98, but we did not make enough progress on Chinese, Japanese or any European languages for that matter. It'd be awesome if we could take that issue to its due conclusion.

It saves a lot of time for the community to have all of the best tools for a particular language in one place.

Hey @anu0012, are you still interested in working on this? We would really appreciate a hand here :)

Sure @NirantK. For the second point you mentioned, what types of code examples and datasets can be added?

@anu0012 Chatbots, virtual assistants and any other popular forms of conversational interface are a good starting point.

E.g. both Microsoft and Facebook have done some work on chatbots; check which datasets they used and whether we can mention them here. Similarly, there is some work on intent detection etc.; maybe check whether that is relevant?

If, at the end of this search, we are still unsatisfied with the quality and breadth of coverage, maybe we can merge this section with Conversational Q&A, which has similar technical challenges imho. I'd mostly go by your (and the community's) recommendations and findings on this.