yandexdataschool / nlp_course

YSDA course in Natural Language Processing

Home Page: https://lena-voita.github.io/nlp_course.html

Installing libraries

justheuristic opened this issue

If you have any issues with libraries, post 'em here.

We assume that you have the basic data science toolkit (sklearn, numpy/scipy/pandas), basically whatever comes with the default Anaconda distribution.

If you don't have it or can't install it (e.g. you use Windows and installation is tricky), there's a Docker container available (see below).

Manual install

  • NLP: pip install --upgrade nltk gensim bs4 editdistance
  • Tensorflow: pip install --upgrade tensorflow keras
  • Other: pip install bokeh tqdm
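
To see which of the libraries above are already importable before (or after) running pip, something like the following may help. It is a minimal sketch that uses only the standard library; the helper name missing_packages is hypothetical, and the list holds import names (e.g. sklearn, bs4), which can differ from the names passed to pip.

```python
import importlib.util

# Import names of the libraries mentioned above. sklearn/numpy/scipy/pandas
# are the basic data science toolkit the course assumes.
COURSE_MODULES = [
    "sklearn", "numpy", "scipy", "pandas",
    "nltk", "gensim", "bs4", "editdistance",
    "tensorflow", "keras",
    "bokeh", "tqdm",
]


def missing_packages(modules=COURSE_MODULES):
    """Return the import names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All listed libraries are importable.")
```

Run it with the same Python interpreter that Jupyter uses, otherwise the result may describe a different environment.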

Installing with GPU

To enable GPU support in tensorflow:

  • uninstall tensorflow if you have it: pip uninstall tensorflow (or conda uninstall tensorflow if you used Anaconda)
  • if you use conda (any OS), try conda install tensorflow-gpu
  • without conda (Linux / macOS only), just pip install tensorflow-gpu. Make sure you have the appropriate CUDA toolkit installed
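
After installing, one way to check whether tensorflow actually sees a GPU is sketched below. The function name visible_gpus is hypothetical. The sketch assumes tf.config.list_physical_devices is available (recent TensorFlow 2.x releases) and otherwise falls back to device_lib.list_local_devices, which older releases provide; it returns None when tensorflow is not installed at all.

```python
def visible_gpus():
    """Return the names of GPUs tensorflow can see, or None if it is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None

    # Newer releases expose tf.config.list_physical_devices.
    list_devices = getattr(getattr(tf, "config", None), "list_physical_devices", None)
    if list_devices is not None:
        return [device.name for device in list_devices("GPU")]

    # Fallback for older releases that lack the call above.
    from tensorflow.python.client import device_lib
    return [d.name for d in device_lib.list_local_devices() if d.device_type == "GPU"]


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("tensorflow is not installed in this environment.")
    elif gpus:
        print("GPUs visible to tensorflow:", gpus)
    else:
        print("tensorflow is installed but sees no GPU.")
```

An empty list usually points at the CUDA toolkit or driver rather than at the pip/conda install itself.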

Install with docker

Pull the course image from Docker Hub
(or just run docker pull justheuristic/nlp_course if you have a Docker shell)

If you want to build it yourself, use these instructions.

If you run into any trouble, feel free to post here, even if it's like "i don't know what the hell are all these words!".

commented

Hi there, I ran this command on my MacBook Pro:

docker pull justheuristic/nlp_course

and I can see the notebook in my browser at the URL "localhost:8888".
However, there is an issue: I have downloaded the WEEK 1 ru & uk language data, but I can NOT load it from the notebook. I don't know why; please help me.

I am sure the data is in the same directory as homework.ipynb, but the line

uk_emb = KeyedVectors.load_word2vec_format("./cc.uk.300.vec")

and the code below it hang. That's the issue.

This is likely because the notebook and the data are stored in different paths.

To debug this, run !pwd in a notebook cell inside your Docker container and compare the output with the path of your data.
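
The same check can be done from Python in a notebook cell. The sketch below is only an illustration: the helper name describe_embeddings_file is hypothetical, and it assumes the .vec file is in the word2vec text format (a first line of the form "<vocab_size> <dimensions>"), which is what load_word2vec_format expects. It reports the working directory, where the relative path resolves to, and, if the file exists, its size and header.

```python
import os


def describe_embeddings_file(path):
    """Report where `path` resolves to and, if it exists, its size and header."""
    info = {
        "cwd": os.getcwd(),                       # same information as !pwd
        "resolved_path": os.path.abspath(path),   # where "./..." really points
        "exists": os.path.isfile(path),
    }
    if info["exists"]:
        info["size_bytes"] = os.path.getsize(path)
        # Read only the first line, so this stays fast even for huge files.
        with open(path, encoding="utf-8", errors="replace") as f:
            header = f.readline().split()
        # word2vec text format: "<vocab_size> <dimensions>" on the first line.
        if len(header) == 2 and all(part.isdigit() for part in header):
            info["vocab_size"] = int(header[0])
            info["dimensions"] = int(header[1])
    return info


if __name__ == "__main__":
    for key, value in describe_embeddings_file("./cc.uk.300.vec").items():
        print(f"{key}: {value}")
```

If "exists" is False, the notebook and the data really are in different places. If the file is found and the header reports a very large vocabulary, a slow load may simply reflect the size of the file rather than a path problem.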

To be sure, you can manually upload everything inside Jupyter using the "Upload" button in the top-right corner of the Jupyter main menu.

If the problem proves difficult to solve, you can also run the notebook in Colab instead, using the Colab links provided with the course.

commented

I am sure the data and the source file are in the same folder. SOMETIMES the first load statement completes, but the second load statement does NOT and hangs there for a long time.

commented

Could you give me the full Colab link for this course? Thank you very much.

For each week, you can find the respective Colab link in the README.

commented

thank you very much👍