GPT2 tests mysteriously killed
barry-jin opened this issue · comments
Description
The GPT2 tests in tests/test_models.py are mysteriously killed. This was found in the recent nightly tests (cu102-2.0.0b20210502 and cu102-2.0.0b20210504).
However, when I tried other MXNet nightly builds prior to cu102-2.0.0b20210502, the error was still there. So I suspect that changes in other upstream packages caused this issue. This needs more investigation.
Error Message
root@ce877b5a6d9d:/workspace/gluon-nlp# python3 -m pytest --durations=50 --device='cpu' --verbose --runslow tests/test_models.py
Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=1945812666 to reproduce.
==================================================================== test session starts =====================================================================
platform linux -- Python 3.6.9, pytest-6.2.3, py-1.10.0, pluggy-0.13.1 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /workspace/gluon-nlp, configfile: pytest.ini
plugins: cov-2.11.1, mock-3.6.0, flaky-3.7.0, env-0.6.2
collected 55 items
tests/test_models.py::test_list_backbone_names PASSED [ 1%]
tests/test_models.py::test_get_backbone[ctx0-google_albert_base_v2] PASSED [ 3%]
tests/test_models.py::test_get_backbone[ctx0-google_albert_large_v2] PASSED [ 5%]
tests/test_models.py::test_get_backbone[ctx0-google_albert_xlarge_v2] PASSED [ 7%]
tests/test_models.py::test_get_backbone[ctx0-google_albert_xxlarge_v2] PASSED [ 9%]
tests/test_models.py::test_get_backbone[ctx0-gluon_en_cased_bert_base_v1] PASSED [ 10%]
tests/test_models.py::test_get_backbone[ctx0-google_en_cased_bert_base] PASSED [ 12%]
tests/test_models.py::test_get_backbone[ctx0-google_en_cased_bert_large] PASSED [ 14%]
tests/test_models.py::test_get_backbone[ctx0-google_en_cased_bert_wwm_large] PASSED [ 16%]
tests/test_models.py::test_get_backbone[ctx0-google_en_uncased_bert_base] PASSED [ 18%]
tests/test_models.py::test_get_backbone[ctx0-google_en_uncased_bert_large] PASSED [ 20%]
tests/test_models.py::test_get_backbone[ctx0-google_en_uncased_bert_wwm_large] PASSED [ 21%]
tests/test_models.py::test_get_backbone[ctx0-google_multi_cased_bert_base] PASSED [ 23%]
tests/test_models.py::test_get_backbone[ctx0-google_zh_bert_base] PASSED [ 25%]
tests/test_models.py::test_get_backbone[ctx0-gluon_electra_small_owt] PASSED [ 27%]
tests/test_models.py::test_get_backbone[ctx0-google_electra_base] PASSED [ 29%]
tests/test_models.py::test_get_backbone[ctx0-google_electra_large] PASSED [ 30%]
tests/test_models.py::test_get_backbone[ctx0-google_electra_small] PASSED [ 32%]
tests/test_models.py::test_get_backbone[ctx0-gpt2_124M] SKIPPED (Skipping GPT-2 test) [ 34%]
tests/test_models.py::test_get_backbone[ctx0-gpt2_1558M] SKIPPED (Skipping GPT-2 test) [ 36%]
tests/test_models.py::test_get_backbone[ctx0-gpt2_355M] Killed
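A bare `Killed` line like the one above is what the shell prints when a process is terminated by SIGKILL. On Linux, one common cause is the kernel's out-of-memory killer, although nothing in this log confirms that this is what happened here. The following is a minimal sketch, assuming a Linux host, of how one might run the test command from Python and report whether it was killed by a signal, along with the peak resident memory of child processes. The helper name `run_and_report` is hypothetical and not part of gluon-nlp.

```python
import resource
import signal
import subprocess


def run_and_report(cmd):
    """Run cmd (a list of arguments) and return (status, peak_child_rss).

    Hypothetical helper for illustration only.
    """
    proc = subprocess.run(cmd)
    # Largest resident set size among waited-for child processes.
    # On Linux this value is reported in kilobytes.
    peak_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    if proc.returncode < 0:
        # subprocess reports "terminated by signal N" as returncode -N.
        try:
            name = signal.Signals(-proc.returncode).name
        except ValueError:
            name = "signal {}".format(-proc.returncode)
        return "killed by {}".format(name), peak_rss
    return "exited with code {}".format(proc.returncode), peak_rss
```

For example, `run_and_report(["python3", "-m", "pytest", "--device=cpu", "--verbose", "--runslow", "tests/test_models.py"])` would return `"killed by SIGKILL"` as the status if the run ends the same way as in the log. A peak memory figure close to the machine's or container's limit would point toward memory exhaustion, but that remains an assumption to verify.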
Would you try the alpha version?
It looks like the alpha version works well. The issue may still be in upstream (MXNet) changes; I will try to fix it.
This might be difficult to fix. You could check whether running the GPT-2 tests alone still causes the error.
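One way to run only the GPT-2 cases is pytest's `-k` option, which selects tests whose names (including parametrization IDs such as `ctx0-gpt2_355M`) match a keyword expression. The sketch below builds such a command from the flags used in the original report. The helper names `gpt2_only_command` and `format_command` are hypothetical, and `--device` and `--runslow` are assumed to be the repository's own pytest options, as they appear in the reported command line.

```python
import shlex


def gpt2_only_command(test_file="tests/test_models.py", device="cpu", keyword="gpt2"):
    """Build a pytest command that selects only tests matching `keyword`.

    Hypothetical helper; the flags mirror the command in the report.
    """
    return [
        "python3", "-m", "pytest",
        "--device={}".format(device),
        "--verbose",
        "--runslow",
        test_file,
        "-k", keyword,  # keyword match on test names and parameter IDs
    ]


def format_command(cmd):
    """Render an argument list as a shell-safe string for copy and paste."""
    return " ".join(shlex.quote(arg) for arg in cmd)


print(format_command(gpt2_only_command()))
```

The resulting argument list can be passed to `subprocess.run` from the repository root. If the GPT-2 tests pass in isolation, that would suggest the failure depends on state accumulated by the earlier tests rather than on the GPT-2 model alone.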