ratsgo / embedding

Korean Embedding (Sentence Embeddings Using Korean Corpora)

Home Page:https://ratsgo.github.io/embedding

word2vec visualize_words error

gyunggyung opened this issue

corpus = "texts.txt"
model.visualize_words(corpus) #,test.png)

After training finishes, running the code above raises the following error.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-13-2733c69db10f> in <module>
      1 corpus = "texts.txt"
----> 2 model.visualize_words(corpus) #,test.png)

C:\github\과제\models\word_eval.py in visualize_words(self, words_fname, palette)
    189                         words.add(word)
    190         vecs = np.array([self.get_sentence_vector(word) for word in words])
--> 191         visualize_words(words, vecs, palette)
    192 
    193     def visualize_between_words(self, words_fname, palette="Viridis256"):

C:\github\과제\models\visualize_utils.py in visualize_words(words, vecs, palette, filename, use_notebook)
    195         show(plot)
    196     else:
--> 197         export_png(plot ,filename)
    198         print("save @ " + filename)
    199 

TypeError: export_png() takes 1 positional argument but 2 were given


I get the same error in the visualization step after fine-tuning BERT on NSMC.

>>> sentences = ["이 영화 엄청 재미있네요", "이 영화 엄청 재미없네요"]

>>> model.visualize_sentences(sentences)
2020-04-20 11:01:29.929900: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pjw1210/embedding/models/sent_eval.py", line 164, in visualize_sentences
    visualize_sentences(vecs, sentences, palette, use_notebook=self.use_notebook)
  File "models/visualize_utils.py", line 32, in visualize_sentences
    export_png(plot, filename)
TypeError: export_png() takes 1 positional argument but 2 were given

>>> model.visualize_self_attention_scores("이 영화 엄청 재미있네요")
BokehDeprecationWarning: Importing from_networkx from bokeh.models.graphs is deprecated and will be removed in Bokeh 3.0. Import from bokeh.plotting instead
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pjw1210/embedding/models/sent_eval.py", line 253, in visualize_self_attention_scores
    visualize_self_attention_scores(tokens, scores, palette, use_notebook=self.use_notebook)
  File "models/visualize_utils.py", line 174, in visualize_self_attention_scores
    export_png(plot, filename)
TypeError: export_png() takes 1 positional argument but 2 were given

@gyunggyung, @JeongwoonPark, thank you for the careful review.
I started the Docker image, trained the model from scratch, and tried to reproduce the problem, and I was able to confirm that the code runs without errors.

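My guess is that this is caused by the version of the bokeh package, which contains the export_png function. The Docker image provided with the book uses bokeh 1.2.0. I suggest running the code in the Docker environment or changing your bokeh version. (In addition, a visualization-related bug was fixed in this issue, so I recommend running git pull origin master in /notebooks/embedding.)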
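The message "takes 1 positional argument but 2 were given" is what Python reports when a function accepts its second parameter by keyword only. The sketch below reproduces that behaviour without bokeh, selenium, or a web driver. Note that export_png_standin is a hypothetical stand-in written for this example, not bokeh's function, and the assumption that the installed bokeh release declares filename as keyword-only is inferred from the traceback rather than confirmed in this thread.

```python
def export_png_standin(obj, *, filename=None):
    """Hypothetical stand-in, NOT bokeh's export_png.

    Assumption: newer bokeh releases accept `filename` by keyword only,
    which is what the bare `*` in this signature models.
    """
    return filename


def call_positionally(plot, filename):
    """Mimic the failing call style export_png(plot, filename)."""
    try:
        export_png_standin(plot, filename)
    except TypeError as err:
        # Return the message so it can be compared with the traceback.
        return str(err)
    return None


def call_with_keyword(plot, filename):
    """Pass filename by keyword; this style also works when the
    parameter is an ordinary positional-or-keyword argument."""
    return export_png_standin(plot, filename=filename)


if __name__ == "__main__":
    print(call_positionally("plot", "test.png"))
    print(call_with_keyword("plot", "test.png"))
```

If the assumption holds, two workarounds are available: install the pinned release (pip install bokeh==1.2.0), or change the calls in models/visualize_utils.py to export_png(plot, filename=filename). The second option is a suggestion only and has not been checked against the fix that was merged upstream.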

For reference, these are the Python package versions in the Docker image provided with the book (checked with pip list).

absl-py             0.6.1
astor               0.7.1
atomicwrites        1.3.0
attrs               19.1.0
backcall            0.1.0
bleach              3.0.2
bokeh               1.2.0
boto                2.49.0
boto3               1.9.180
botocore            1.12.180
certifi             2019.6.16
chardet             3.0.4
cmake               3.14.4
cycler              0.10.0
decorator           4.3.0
defusedxml          0.5.0
docutils            0.14
entrypoints         0.2.3
fasttext            0.9
funcy               1.12
future              0.17.1
gast                0.2.0
gensim              3.7.3
grpcio              1.16.0
h5py                2.8.0
idna                2.8
importlib-metadata  0.18
ipykernel           5.1.0
ipython             7.1.1
ipython-genutils    0.2.0
ipywidgets          7.4.2
jedi                0.13.1
Jinja2              2.10
jmespath            0.9.4
joblib              0.13.2
JPype1              0.7.0
jsonschema          2.6.0
jupyter             1.0.0
jupyter-client      5.2.3
jupyter-console     6.0.0
jupyter-core        4.4.0
Keras-Applications  1.0.6
Keras-Preprocessing 1.0.5
khaiii              0.4
kiwisolver          1.0.1
konlpy              0.5.1
lxml                4.3.4
Markdown            3.0.1
MarkupSafe          1.1.0
matplotlib          3.0.1
mecab-python        0.996-ko-0.9.2
mistune             0.8.4
more-itertools      7.1.0
nbconvert           5.4.0
nbformat            4.4.0
networkx            2.3
notebook            5.7.0
numexpr             2.6.9
numpy               1.15.4
packaging           19.0
pandas              0.23.4
pandocfilters       1.4.2
parso               0.3.1
pathlib2            2.3.4
pexpect             4.6.0
pickleshare         0.7.5
Pillow              5.3.0
pip                 19.1.1
pluggy              0.12.0
prometheus-client   0.4.2
prompt-toolkit      2.0.7
protobuf            3.6.1
psutil              5.6.3
ptyprocess          0.6.0
py                  1.8.0
pybind11            2.3.0
pycurl              7.43.0
Pygments            2.2.0
pygobject           3.20.0
pyLDAvis            2.1.2
pyparsing           2.3.0
pytest              5.0.0
python-apt          1.1.0b1+ubuntu0.16.4.2
python-dateutil     2.7.5
pytz                2018.7
PyYAML              5.1.1
pyzmq               17.1.2
qtconsole           4.4.2
requests            2.22.0
s3transfer          0.2.1
scikit-learn        0.20.0
scipy               1.1.0
selenium            3.141.0
Send2Trash          1.5.0
sentencepiece       0.1.82
setuptools          40.5.0
six                 1.11.0
sklearn             0.0
smart-open          1.8.4
soynlp              0.0.492
soyspacing          1.0.15
tensorboard         1.12.0
tensorflow-gpu      1.12.0
termcolor           1.1.0
terminado           0.8.1
testpath            0.4.2
tornado             5.1.1
traitlets           4.3.2
urllib3             1.25.3
wcwidth             0.1.7
webencodings        0.5.1
Werkzeug            0.14.1
wheel               0.32.2
widgetsnbextension  3.4.2
zipp                0.5.1
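To compare a local environment against the list above, a small script can look up installed versions and report any that differ. This is a minimal sketch under stated assumptions: the PINNED dictionary is a subset copied from the list for illustration, the helper names installed_version and find_mismatches are hypothetical, and importlib.metadata requires Python 3.8 or later (on older interpreters the importlib_metadata backport provides the same two names).

```python
from importlib import metadata

# Subset of the pinned versions listed above, chosen for illustration.
PINNED = {
    "bokeh": "1.2.0",
    "selenium": "3.141.0",
    "Pillow": "5.3.0",
    "networkx": "2.3",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Return {name: (pinned_version, installed_version)} for every
    package whose installed version differs from the pinned one."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINNED).items():
        print(f"{name}: pinned {wanted}, installed {found or 'not installed'}")
```

The lookup parameter exists so that the comparison logic can be exercised with a fake lookup function, without depending on what happens to be installed.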