dmlc / gluon-nlp

NLP made easy

GPT2 tests mysteriously killed

barry-jin opened this issue


The GPT2 tests in tests/ are mysteriously killed. This was found in the recent nightly tests (cu102-2.0.0b20210502 and cu102-2.0.0b20210504).
However, when I tried other MXNet nightly builds prior to cu102-2.0.0b20210502, the error was still there. So I suspect that changes in other upstream packages caused this issue. It needs more investigation.

Error Message

root@ce877b5a6d9d:/workspace/gluon-nlp# python3 -m pytest --durations=50 --device='cpu' --verbose --runslow tests/
Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=1945812666 to reproduce.
==================================================================== test session starts =====================================================================
platform linux -- Python 3.6.9, pytest-6.2.3, py-1.10.0, pluggy-0.13.1 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /workspace/gluon-nlp, configfile: pytest.ini
plugins: cov-2.11.1, mock-3.6.0, flaky-3.7.0, env-0.6.2
collected 55 items                                                                                                                                           

tests/ PASSED                                                                                                  [  1%]
tests/[ctx0-google_albert_base_v2] PASSED                                                                             [  3%]
tests/[ctx0-google_albert_large_v2] PASSED                                                                            [  5%]
tests/[ctx0-google_albert_xlarge_v2] PASSED                                                                           [  7%]
tests/[ctx0-google_albert_xxlarge_v2] PASSED                                                                          [  9%]
tests/[ctx0-gluon_en_cased_bert_base_v1] PASSED                                                                       [ 10%]
tests/[ctx0-google_en_cased_bert_base] PASSED                                                                         [ 12%]
tests/[ctx0-google_en_cased_bert_large] PASSED                                                                        [ 14%]
tests/[ctx0-google_en_cased_bert_wwm_large] PASSED                                                                    [ 16%]
tests/[ctx0-google_en_uncased_bert_base] PASSED                                                                       [ 18%]
tests/[ctx0-google_en_uncased_bert_large] PASSED                                                                      [ 20%]
tests/[ctx0-google_en_uncased_bert_wwm_large] PASSED                                                                  [ 21%]
tests/[ctx0-google_multi_cased_bert_base] PASSED                                                                      [ 23%]
tests/[ctx0-google_zh_bert_base] PASSED                                                                               [ 25%]
tests/[ctx0-gluon_electra_small_owt] PASSED                                                                           [ 27%]
tests/[ctx0-google_electra_base] PASSED                                                                               [ 29%]
tests/[ctx0-google_electra_large] PASSED                                                                              [ 30%]
tests/[ctx0-google_electra_small] PASSED                                                                              [ 32%]
tests/[ctx0-gpt2_124M] SKIPPED (Skipping GPT-2 test)                                                                  [ 34%]
tests/[ctx0-gpt2_1558M] SKIPPED (Skipping GPT-2 test)                                                                 [ 36%]
tests/[ctx0-gpt2_355M] Killed

Would you try the alpha version?

Looks like the alpha version works well. The issue may still be in upstream (MXNet) changes; I will try to fix it.

This might be difficult to fix. You may try to see if running GPT-2 alone will still cause the error.
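
One way to follow up on that suggestion, as a rough sketch rather than anything from the gluon-nlp tooling: wrap the test command in a small Python script that reports whether the process died from SIGKILL and how much memory the child used at its peak. A bare "Killed" in the log often (though not always) means the kernel's out-of-memory killer stepped in, and that is only a guess here. The helper name `run_and_report` is hypothetical, and the peak-RSS unit assumes Linux.

```python
import resource
import signal
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd and return (return_code, killed_by_sigkill, peak_rss_kib)."""
    proc = subprocess.run(cmd)
    # On Linux, ru_maxrss is the peak resident set size (in KiB) of the
    # largest child process that has been waited for so far.
    peak_rss_kib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # subprocess reports death by signal N as a return code of -N.
    killed = proc.returncode == -signal.SIGKILL
    return proc.returncode, killed, peak_rss_kib


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the command to run.
    code, killed, peak = run_and_report(sys.argv[1:])
    print(f"return code: {code}, SIGKILL: {killed}, peak child RSS: {peak} KiB")
```

Saved under a hypothetical name such as `run_report.py`, it could be invoked as `python3 run_report.py python3 -m pytest --device=cpu --verbose --runslow -k gpt2 tests/`. The `-k gpt2` filter is pytest's standard keyword selection and is assumed to match the GPT-2 test IDs shown in the log; the other flags are copied from the original command.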