Memory allocation failed out of memory #19420

barry-jin · 2020-10-23T23:05:59Z

Description

Running the full GluonNLP test suite with pytest on mxnet-cu102==2.0.0b20201022 triggers a threading error (see Error Message).
Running the same full suite on mxnet-cu102==2.0.0b20201016 does not trigger this error.
Running the test files separately does not trigger it either.
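
Since the failure only shows up when the whole suite shares one pytest process, it can help to run each test file in its own interpreter and compare. The following is a minimal sketch of that idea, not the script used in this report. The helper names `per_file_commands` and `run_isolated` are hypothetical; the `--device=gpu` and `--runslow` flags are the ones from the GluonNLP command shown in To Reproduce.

```python
import subprocess
import sys
from pathlib import Path


def per_file_commands(test_dir, extra_args=("--device=gpu", "--runslow")):
    """Build one pytest command per test file so each file gets a fresh process."""
    files = sorted(Path(test_dir).glob("test_*.py"))
    return [[sys.executable, "-m", "pytest", *extra_args, str(f)] for f in files]


def run_isolated(test_dir):
    """Run every test file in its own process and return {file: exit code}."""
    results = {}
    for cmd in per_file_commands(test_dir):
        # Each file starts in a new interpreter, so GPU memory and threads
        # held by the previous file are gone before the next one runs.
        results[cmd[-1]] = subprocess.run(cmd).returncode
    return results
```

Calling `run_isolated("./tests")` from the gluon-nlp checkout would give one exit code per file. If every file passes this way but the combined run aborts, that points at state accumulating across files rather than at a single broken test.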

Error Message

Run GluonNLP pytest with `mxnet-cu102==2.0.0b20201022`

[2020-10-22T21:15:51.430Z] ============================= test session starts ==============================
[2020-10-22T21:15:51.430Z] platform linux -- Python 3.6.9, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
[2020-10-22T21:15:51.432Z] rootdir: /workspace/gluon-nlp, configfile: pytest.ini
[2020-10-22T21:15:51.432Z] plugins: cov-2.10.1
[2020-10-22T21:15:52.426Z] collected 1283 items
[2020-10-22T21:16:01.630Z] tests/test_attention_cell.py ........................................... [  3%]
[2020-10-22T21:16:06.668Z] ......................................................................   [  8%]
[2020-10-22T21:16:06.796Z] tests/test_data_batchify.py ............................................ [ 12%]
[2020-10-22T21:16:21.672Z] .................................                                        [ 14%]
[2020-10-22T21:16:30.051Z] tests/test_data_filtering.py .....                                       [ 15%]
[2020-10-22T21:16:36.895Z] tests/test_data_loading.py .                                             [ 15%]
[2020-10-22T21:16:37.213Z] tests/test_data_sampler.py ............................................. [ 18%]
[2020-10-22T21:16:38.566Z] ........................................................................ [ 24%]
[2020-10-22T21:16:40.003Z] ........................................................................ [ 30%]
[2020-10-22T21:16:40.579Z] ........................................................................ [ 35%]
[2020-10-22T21:16:41.143Z] ........................................................................ [ 41%]
[2020-10-22T21:16:42.040Z] ........................................................................ [ 46%]
[2020-10-22T21:16:42.299Z] ...............                                                          [ 48%]
[2020-10-22T21:18:34.088Z] tests/test_data_tokenizers.py ..............                             [ 49%]
[2020-10-22T21:18:34.095Z] tests/test_data_vocab.py .                                               [ 49%]
[2020-10-22T21:22:22.268Z] tests/test_embedding.py ..                                               [ 49%]
[2020-10-22T21:22:59.289Z] tests/test_gluon_block.py .....                                          [ 49%]
[2020-10-22T21:22:59.328Z] tests/test_initializer.py ...                                            [ 49%]
[2020-10-22T21:23:00.225Z] tests/test_layers.py ...........................                         [ 52%]
[2020-10-22T21:23:00.312Z] tests/test_loss.py ........................                              [ 53%]
[2020-10-22T21:37:39.851Z] tests/test_models.py ................................................    [ 57%]
[2020-10-22T21:38:46.438Z] tests/test_models_albert.py .................                            [ 59%]
[2020-10-22T21:39:38.599Z] tests/test_models_bart.py ......                                         [ 59%]
[2020-10-22T21:44:18.743Z] tests/test_models_bert.py ............                                   [ 60%]
[2020-10-22T21:46:00.142Z] tests/test_models_electra.py ........                                    [ 61%]
[2020-10-22T21:49:47.086Z] tests/test_models_gpt2.py .......F                                       [ 61%]
[2020-10-22T21:49:57.226Z] tests/test_models_mobilebert.py .....                                    [ 62%]
[2020-10-22T21:51:27.552Z] tests/test_models_roberta.py ....FF                                      [ 62%]
[2020-10-22T21:52:10.783Z] tests/test_models_transformer.py ....................................... [ 65%]
[2020-10-22T21:53:33.876Z] ........................................................................ [ 71%]
[2020-10-22T21:54:26.540Z] ..........................................FFFFF                          [ 74%]
[2020-10-22T21:54:34.975Z] tests/test_models_transformer_xl.py ......                               [ 75%]
[2020-10-22T21:55:47.820Z] tests/test_models_xlmr.py .FF                                            [ 75%]
[2020-10-22T21:55:48.122Z] tests/test_op.py ....................................................... [ 79%]
[2020-10-22T21:55:48.754Z] ........................................................................ [ 85%]
[2020-10-22T21:55:49.195Z] ....                                                                     [ 85%]
[2020-10-22T21:56:20.712Z] tests/test_optimizer.py .                                                [ 85%]
[2020-10-22T21:56:20.716Z] tests/test_pytest.py .                                                   [ 85%]
[2020-10-22T21:56:21.005Z] tests/test_sequence_sampler.py ......................................... [ 89%]
[2020-10-22T21:56:21.522Z] ........................................................................ [ 94%]
[2020-10-22T21:56:33.345Z] .......................................                                  [ 97%]
[2020-10-22T21:56:33.590Z] Fatal Python error: Aborted
[2020-10-22T21:56:33.590Z] Thread 0x00007f92b9fff700 (most recent call first):
[2020-10-22T21:56:33.590Z]   File "/usr/lib/python3.6/threading.py", line 299 in wait
[2020-10-22T21:56:33.590Z]   File "/usr/lib/python3.6/threading.py", line 551 in wait
[2020-10-22T21:56:33.590Z]   File "/usr/local/lib/python3.6/dist-packages/tqdm/_monitor.py", line 59 in run
[2020-10-22T21:56:33.590Z]   File "/usr/lib/python3.6/threading.py", line 916 in _bootstrap_inner
[2020-10-22T21:56:33.590Z]   File "/usr/lib/python3.6/threading.py", line 884 in _bootstrap
[2020-10-22T21:56:33.590Z] Current thread 0x00007f9457153740 (most recent call first):
[2020-10-22T21:56:33.590Z]   File "/usr/lib/python3.6/multiprocessing/popen_fork.py", line 66 in _launch
[2020-10-22T21:56:33.590Z]   File "/usr/lib/python3.6/multiprocessing/popen_fork.py", line 19 in __init__
[2020-10-22T21:56:33.590Z]   File "/usr/lib/python3.6/multiprocessing/context.py", line 277 in _Popen
[2020-10-22T21:56:33.590Z]   File "/usr/lib/python3.6/multiprocessing/process.py", line 105 in start
[2020-10-22T21:56:33.590Z]   File "/usr/lib/python3.6/multiprocessing/pool.py", line 239 in _repopulate_pool
[2020-10-22T21:56:33.591Z]   File "/usr/lib/python3.6/multiprocessing/pool.py", line 174 in __init__
[2020-10-22T21:56:33.591Z]   File "/usr/lib/python3.6/multiprocessing/context.py", line 119 in Pool
[2020-10-22T21:56:33.591Z]   File "/workspace/gluon-nlp/tests/test_utils_misc.py", line 87 in verify_download
[2020-10-22T21:56:33.591Z]   File "/workspace/gluon-nlp/tests/test_utils_misc.py", line 102 in test_download_s3
[2020-10-22T21:56:33.591Z]   File "/root/.local/lib/python3.6/site-packages/_pytest/python.py", line 184 in pytest_pyfunc_call
[2020-10-22T21:56:33.591Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/callers.py", line 187 in _multicall
[2020-10-22T21:56:33.591Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/manager.py", line 87 in <lambda>
[2020-10-22T21:56:33.591Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/manager.py", line 93 in _hookexec
[2020-10-22T21:56:33.591Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/hooks.py", line 286 in __call__
[2020-10-22T21:56:33.591Z]   File "/root/.local/lib/python3.6/site-packages/_pytest/python.py", line 1627 in runtest
[2020-10-22T21:56:33.591Z]   File "/root/.local/lib/python3.6/site-packages/_pytest/runner.py", line 163 in pytest_runtest_call
[2020-10-22T21:56:33.591Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/callers.py", line 187 in _multicall
[2020-10-22T21:56:33.591Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/manager.py", line 87 in <lambda>
[2020-10-22T21:56:33.592Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/manager.py", line 93 in _hookexec
[2020-10-22T21:56:33.592Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/hooks.py", line 286 in __call__
[2020-10-22T21:56:33.592Z]   File "/root/.local/lib/python3.6/site-packages/_pytest/runner.py", line 256 in <lambda>
[2020-10-22T21:56:33.592Z]   File "/root/.local/lib/python3.6/site-packages/_pytest/runner.py", line 310 in from_call
[2020-10-22T21:56:33.592Z]   File "/root/.local/lib/python3.6/site-packages/_pytest/runner.py", line 256 in call_runtest_hook
[2020-10-22T21:56:33.592Z]   File "/root/.local/lib/python3.6/site-packages/_pytest/runner.py", line 216 in call_and_report
[2020-10-22T21:56:33.592Z]   File "/root/.local/lib/python3.6/site-packages/_pytest/runner.py", line 127 in runtestprotocol
[2020-10-22T21:56:33.592Z]   File "/root/.local/lib/python3.6/site-packages/_pytest/runner.py", line 110 in pytest_runtest_protocol
[2020-10-22T21:56:33.592Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/callers.py", line 187 in _multicall
[2020-10-22T21:56:33.592Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/manager.py", line 87 in <lambda>
[2020-10-22T21:56:33.592Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/manager.py", line 93 in _hookexec
[2020-10-22T21:56:33.592Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/hooks.py", line 286 in __call__
[2020-10-22T21:56:33.593Z]   File "/root/.local/lib/python3.6/site-packages/_pytest/main.py", line 338 in pytest_runtestloop
[2020-10-22T21:56:33.593Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/callers.py", line 187 in _multicall
[2020-10-22T21:56:33.593Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/manager.py", line 87 in <lambda>
[2020-10-22T21:56:33.593Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/manager.py", line 93 in _hookexec
[2020-10-22T21:56:33.593Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/hooks.py", line 286 in __call__
[2020-10-22T21:56:33.593Z]   File "/root/.local/lib/python3.6/site-packages/_pytest/main.py", line 313 in _main
[2020-10-22T21:56:33.593Z]   File "/root/.local/lib/python3.6/site-packages/_pytest/main.py", line 257 in wrap_session
[2020-10-22T21:56:33.593Z]   File "/root/.local/lib/python3.6/site-packages/_pytest/main.py", line 306 in pytest_cmdline_main
[2020-10-22T21:56:33.593Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/callers.py", line 187 in _multicall
[2020-10-22T21:56:33.593Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/manager.py", line 87 in <lambda>
[2020-10-22T21:56:33.593Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/manager.py", line 93 in _hookexec
[2020-10-22T21:56:33.593Z]   File "/root/.local/lib/python3.6/site-packages/pluggy/hooks.py", line 286 in __call__
[2020-10-22T21:56:33.594Z]   File "/root/.local/lib/python3.6/site-packages/_pytest/config/__init__.py", line 165 in main
[2020-10-22T21:56:33.594Z]   File "/root/.local/lib/python3.6/site-packages/_pytest/config/__init__.py", line 187 in console_main
[2020-10-22T21:56:33.594Z]   File "/root/.local/lib/python3.6/site-packages/pytest/__main__.py", line 5 in <module>
[2020-10-22T21:56:33.594Z]   File "/usr/lib/python3.6/runpy.py", line 85 in _run_code
[2020-10-22T21:56:33.594Z]   File "/usr/lib/python3.6/runpy.py", line 193 in _run_module_as_main
[2020-10-22T22:00:07.664Z] ./gluon_nlp_job.sh: line 39:    44 Aborted                 (core dumped) /bin/bash -o pipefail -c "$COMMAND"

To Reproduce

Compute Environment: 
Instance type: g4dn.4x
vCPUs: 16

Run reproduce.sh (shown below).

reproduce.sh

#!/bin/bash
python3 -m pip install -U --quiet --pre "mxnet-cu102==2.0.0b20201022" -f https://dist.mxnet.io/python
git clone https://github.com/dmlc/gluon-nlp; cd gluon-nlp
git checkout master
python3 -m pip install --quiet -e .[extras]
python3 -m pytest --cov=. --cov-config=./.coveragerc --cov-report=xml --durations=50 --device="gpu" --runslow ./tests/

$ chmod +x reproduce.sh
$ ./reproduce.sh

What have you tried to solve it?

Some observations:

The failed tests all call mx.npx.waitall().
The abort in the full-suite run happens inside multiprocessing.Pool().
Bisecting the commits between the nightly builds mxnet-cu102==2.0.0b20201016 and mxnet-cu102==2.0.0b20201022 shows that the first bad commit is "Remove cleanup on side threads" (#19378).
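
For readers unfamiliar with bisecting, the search for the first bad commit is a binary search over the commit history. The sketch below illustrates the logic only and is not the procedure actually run for this report (`git bisect` automates the same thing). It assumes a linear history that is good at the start and bad from some commit onward; `first_bad` and the commit labels are hypothetical.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad(commit) is True.

    Assumes commits are ordered oldest to newest, commits[0] is good,
    commits[-1] is bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # first bad commit is at mid or earlier
        else:
            lo = mid  # first bad commit is after mid
    return commits[hi]


# Hypothetical labels; in practice is_bad would build MXNet at that commit
# and run the failing GluonNLP tests.
history = ["c0", "c1", "c2", "c3", "c4", "c5"]
print(first_bad(history, lambda c: history.index(c) >= 3))  # prints c3
```

Each step halves the candidate range, so the number of builds needed grows only logarithmically with the number of commits between the two nightlies.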

Environment

We recommend using our script for collecting the diagnostic information with the following command
curl --retry 10 -s https://raw.githubusercontent.com/apache/incubator-mxnet/master/tools/diagnose.py | python3

Environment Information

[2020-10-27T16:59:27.002Z] ----------Python Info----------
[2020-10-27T16:59:27.002Z] Version      : 3.6.9
[2020-10-27T16:59:27.002Z] Compiler     : GCC 8.4.0
[2020-10-27T16:59:27.002Z] Build        : ('default', 'Oct  8 2020 12:12:24')
[2020-10-27T16:59:27.003Z] Arch         : ('64bit', '')
[2020-10-27T16:59:27.003Z] ------------Pip Info-----------
[2020-10-27T16:59:27.004Z] Version      : 20.2.4
[2020-10-27T16:59:27.004Z] Directory    : /usr/local/lib/python3.6/dist-packages/pip
[2020-10-27T16:59:27.004Z] ----------MXNet Info-----------
[2020-10-27T16:59:28.271Z] Version      : 2.0.0
[2020-10-27T16:59:28.271Z] Directory    : /root/.local/lib/python3.6/site-packages/mxnet
[2020-10-27T16:59:28.271Z] Commit hash file "/root/.local/lib/python3.6/site-packages/mxnet/COMMIT_HASH" not found. Not installed from pre-built package or built from source.
[2020-10-27T16:59:28.271Z] Library      : ['/root/.local/lib/python3.6/site-packages/mxnet/libmxnet.so']
[2020-10-27T16:59:28.271Z] Build features:
[2020-10-27T16:59:28.271Z] ✔ CUDA
[2020-10-27T16:59:28.271Z] ✔ CUDNN
[2020-10-27T16:59:28.271Z] ✖ NCCL
[2020-10-27T16:59:28.271Z] ✖ TENSORRT
[2020-10-27T16:59:28.271Z] ✖ CUTENSOR
[2020-10-27T16:59:28.271Z] ✔ CPU_SSE
[2020-10-27T16:59:28.271Z] ✔ CPU_SSE2
[2020-10-27T16:59:28.271Z] ✔ CPU_SSE3
[2020-10-27T16:59:28.271Z] ✖ CPU_SSE4_1
[2020-10-27T16:59:28.271Z] ✖ CPU_SSE4_2
[2020-10-27T16:59:28.271Z] ✖ CPU_SSE4A
[2020-10-27T16:59:28.271Z] ✖ CPU_AVX
[2020-10-27T16:59:28.271Z] ✖ CPU_AVX2
[2020-10-27T16:59:28.271Z] ✔ OPENMP
[2020-10-27T16:59:28.271Z] ✖ SSE
[2020-10-27T16:59:28.271Z] ✖ F16C
[2020-10-27T16:59:28.271Z] ✖ JEMALLOC
[2020-10-27T16:59:28.271Z] ✔ BLAS_OPEN
[2020-10-27T16:59:28.271Z] ✖ BLAS_ATLAS
[2020-10-27T16:59:28.271Z] ✖ BLAS_MKL
[2020-10-27T16:59:28.271Z] ✖ BLAS_APPLE
[2020-10-27T16:59:28.271Z] ✔ LAPACK
[2020-10-27T16:59:28.271Z] ✔ MKLDNN
[2020-10-27T16:59:28.271Z] ✔ OPENCV
[2020-10-27T16:59:28.271Z] ✔ DIST_KVSTORE
[2020-10-27T16:59:28.271Z] ✖ INT64_TENSOR_SIZE
[2020-10-27T16:59:28.271Z] ✔ SIGNAL_HANDLER
[2020-10-27T16:59:28.271Z] ✖ DEBUG
[2020-10-27T16:59:28.271Z] ✖ TVM_OP
[2020-10-27T16:59:28.271Z] ----------System Info----------
[2020-10-27T16:59:28.272Z] Platform     : Linux-4.14.186-146.268.amzn2.x86_64-x86_64-with-Ubuntu-18.04-bionic
[2020-10-27T16:59:28.272Z] system       : Linux
[2020-10-27T16:59:28.272Z] node         : ip-10-20-91-122.ec2.internal
[2020-10-27T16:59:28.272Z] release      : 4.14.186-146.268.amzn2.x86_64
[2020-10-27T16:59:28.272Z] version      : #1 SMP Tue Jul 14 18:16:52 UTC 2020
[2020-10-27T16:59:28.272Z] ----------Hardware Info----------
[2020-10-27T16:59:28.272Z] machine      : x86_64
[2020-10-27T16:59:28.272Z] processor    : x86_64
[2020-10-27T16:59:28.297Z] Architecture:        x86_64
[2020-10-27T16:59:28.297Z] CPU op-mode(s):      32-bit, 64-bit
[2020-10-27T16:59:28.297Z] Byte Order:          Little Endian
[2020-10-27T16:59:28.297Z] CPU(s):              16
[2020-10-27T16:59:28.297Z] On-line CPU(s) list: 0-15
[2020-10-27T16:59:28.297Z] Thread(s) per core:  2
[2020-10-27T16:59:28.297Z] Core(s) per socket:  8
[2020-10-27T16:59:28.297Z] Socket(s):           1
[2020-10-27T16:59:28.297Z] NUMA node(s):        1
[2020-10-27T16:59:28.297Z] Vendor ID:           GenuineIntel
[2020-10-27T16:59:28.297Z] CPU family:          6
[2020-10-27T16:59:28.297Z] Model:               85
[2020-10-27T16:59:28.297Z] Model name:          Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
[2020-10-27T16:59:28.297Z] Stepping:            7
[2020-10-27T16:59:28.297Z] CPU MHz:             3103.458
[2020-10-27T16:59:28.297Z] BogoMIPS:            4999.99
[2020-10-27T16:59:28.297Z] Hypervisor vendor:   KVM
[2020-10-27T16:59:28.297Z] Virtualization type: full
[2020-10-27T16:59:28.297Z] L1d cache:           32K
[2020-10-27T16:59:28.297Z] L1i cache:           32K
[2020-10-27T16:59:28.297Z] L2 cache:            1024K
[2020-10-27T16:59:28.297Z] L3 cache:            36608K
[2020-10-27T16:59:28.297Z] NUMA node0 CPU(s):   0-15
[2020-10-27T16:59:28.297Z] Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke avx512_vnni
[2020-10-27T16:59:28.298Z] ----------Network Test----------
[2020-10-27T16:59:28.298Z] Setting timeout: 10
[2020-10-27T16:59:28.766Z] Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0007 sec, LOAD: 0.4678 sec.
[2020-10-27T16:59:29.018Z] Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.0861 sec, LOAD: 0.1656 sec.
[2020-10-27T16:59:29.168Z] Error open Gluon Tutorial(cn): https://zh.gluon.ai, <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:852)>, DNS finished in 0.11675071716308594 sec.
[2020-10-27T16:59:29.307Z] Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0076 sec, LOAD: 0.1308 sec.
[2020-10-27T16:59:29.489Z] Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0034 sec, LOAD: 0.1785 sec.
[2020-10-27T16:59:29.564Z] Error open Conda: https://repo.continuum.io/pkgs/free/, HTTP Error 403: Forbidden, DNS finished in 0.02842235565185547 sec.
[2020-10-27T16:59:29.564Z] ----------Environment----------


barry-jin · 2020-11-04T18:10:21Z

Update

To Reproduce

The error can also be reproduced by running a small subset of the tests.

python3 -m pip install -U --quiet --pre "mxnet-cu102==2.0.0b20201022" -f https://dist.mxnet.io/python
git clone https://github.com/dmlc/gluon-nlp; cd gluon-nlp
git checkout master
python3 -m pip install --quiet -e .[extras]
python3 -m pytest --device='gpu' --verbose --runslow tests/test_models.py tests/test_models_albert.py tests/test_models_bart.py tests/test_models_bert.py

Error Message

Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=1362441855 to reproduce.
============================== test session starts ===============================
platform linux -- Python 3.6.9, pytest-6.1.2, py-1.9.0, pluggy-0.13.1 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /workspace/gluon-nlp, configfile: pytest.ini
plugins: cov-2.10.1
collected 95 items                                                               

tests/test_models.py::test_list_backbone_names PASSED                      [  1%]
tests/test_models.py::test_get_backbone[ctx0-google_albert_base_v2] PASSED [  2%]
tests/test_models.py::test_get_backbone[ctx0-google_albert_large_v2] PASSED [  3%]
tests/test_models.py::test_get_backbone[ctx0-google_albert_xlarge_v2] PASSED [  4%]
tests/test_models.py::test_get_backbone[ctx0-google_albert_xxlarge_v2] PASSED [  5%]
tests/test_models.py::test_get_backbone[ctx0-google_en_cased_bert_base] PASSED [  6%]
tests/test_models.py::test_get_backbone[ctx0-google_en_cased_bert_large] PASSED [  7%]
tests/test_models.py::test_get_backbone[ctx0-google_en_cased_bert_wwm_large] PASSED [  8%]
tests/test_models.py::test_get_backbone[ctx0-google_en_uncased_bert_base] PASSED [  9%]
tests/test_models.py::test_get_backbone[ctx0-google_en_uncased_bert_large] PASSED [ 10%]
tests/test_models.py::test_get_backbone[ctx0-google_en_uncased_bert_wwm_large] PASSED [ 11%]
tests/test_models.py::test_get_backbone[ctx0-google_multi_cased_bert_base] PASSED [ 12%]
tests/test_models.py::test_get_backbone[ctx0-google_zh_bert_base] PASSED   [ 13%]
tests/test_models.py::test_get_backbone[ctx0-gluon_electra_small_owt] PASSED [ 14%]
tests/test_models.py::test_get_backbone[ctx0-google_electra_base] PASSED   [ 15%]
tests/test_models.py::test_get_backbone[ctx0-google_electra_large] PASSED  [ 16%]
tests/test_models.py::test_get_backbone[ctx0-google_electra_small] PASSED  [ 17%]
tests/test_models.py::test_get_backbone[ctx0-gpt2_124M] PASSED             [ 18%]
tests/test_models.py::test_get_backbone[ctx0-gpt2_1558M] PASSED            [ 20%]
tests/test_models.py::test_get_backbone[ctx0-gpt2_355M] PASSED             [ 21%]
tests/test_models.py::test_get_backbone[ctx0-gpt2_774M] PASSED             [ 22%]
tests/test_models.py::test_get_backbone[ctx0-google_uncased_mobilebert] PASSED [ 23%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_roberta_base] PASSED  [ 24%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_roberta_large] PASSED [ 25%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_xlmr_base] PASSED     [ 26%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_xlmr_large] PASSED    [ 27%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_bart_base] PASSED     [ 28%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_bart_large] PASSED    [ 29%]
tests/test_models.py::test_tvm_integration[ctx0-NT-2-4-google_albert_base_v2] PASSED [ 30%]
tests/test_models.py::test_tvm_integration[ctx0-NT-2-4-google_en_cased_bert_base] PASSED [ 31%]
tests/test_models.py::test_tvm_integration[ctx0-NT-2-4-google_electra_small] PASSED [ 32%]
tests/test_models.py::test_tvm_integration[ctx0-NT-2-4-fairseq_bart_base] PASSED [ 33%]
tests/test_models.py::test_tvm_integration[ctx0-NT-1-4-google_albert_base_v2] PASSED [ 34%]
tests/test_models.py::test_tvm_integration[ctx0-NT-1-4-google_en_cased_bert_base] PASSED [ 35%]
tests/test_models.py::test_tvm_integration[ctx0-NT-1-4-google_electra_small] PASSED [ 36%]
tests/test_models.py::test_tvm_integration[ctx0-NT-1-4-fairseq_bart_base] PASSED [ 37%]
tests/test_models.py::test_tvm_integration[ctx0-TN-2-4-google_albert_base_v2] PASSED [ 38%]
tests/test_models.py::test_tvm_integration[ctx0-TN-2-4-google_en_cased_bert_base] PASSED [ 40%]
tests/test_models.py::test_tvm_integration[ctx0-TN-2-4-google_electra_small] PASSED [ 41%]
tests/test_models.py::test_tvm_integration[ctx0-TN-2-4-fairseq_bart_base] PASSED [ 42%]
tests/test_models.py::test_tvm_integration[ctx0-TN-1-4-google_albert_base_v2] PASSED [ 43%]
tests/test_models.py::test_tvm_integration[ctx0-TN-1-4-google_en_cased_bert_base] PASSED [ 44%]
tests/test_models.py::test_tvm_integration[ctx0-TN-1-4-google_electra_small] PASSED [ 45%]
tests/test_models.py::test_tvm_integration[ctx0-TN-1-4-fairseq_bart_base] PASSED [ 46%]
tests/test_models_albert.py::test_albert_backbone[auto-False-False] PASSED [ 47%]
tests/test_models_albert.py::test_albert_backbone[auto-True-True] PASSED   [ 48%]
tests/test_models_albert.py::test_albert_backbone[NT-False-False] PASSED   [ 49%]
tests/test_models_albert.py::test_albert_backbone[NT-True-True] PASSED     [ 50%]
tests/test_models_albert.py::test_albert_backbone[TN-False-False] PASSED   [ 51%]
tests/test_models_albert.py::test_albert_backbone[TN-True-True] PASSED     [ 52%]
tests/test_models_albert.py::test_albert_for_mlm_model[auto] PASSED        [ 53%]
tests/test_models_albert.py::test_albert_for_mlm_model[NT] PASSED          [ 54%]
tests/test_models_albert.py::test_albert_for_mlm_model[TN] PASSED          [ 55%]
tests/test_models_albert.py::test_albert_for_pretrain_model[auto] PASSED   [ 56%]
tests/test_models_albert.py::test_albert_for_pretrain_model[NT] PASSED     [ 57%]
tests/test_models_albert.py::test_albert_for_pretrain_model[TN] PASSED     [ 58%]
tests/test_models_albert.py::test_list_pretrained_albert PASSED            [ 60%]
tests/test_models_albert.py::test_albert_get_pretrained[google_albert_base_v2] PASSED [ 61%]
tests/test_models_albert.py::test_albert_get_pretrained[google_albert_large_v2] PASSED [ 62%]
tests/test_models_albert.py::test_albert_get_pretrained[google_albert_xlarge_v2] PASSED [ 63%]
tests/test_models_albert.py::test_albert_get_pretrained[google_albert_xxlarge_v2] PASSED [ 64%]
tests/test_models_bart.py::test_list_pretrained_bart PASSED                [ 65%]
tests/test_models_bart.py::test_bart[fairseq_bart_base] PASSED             [ 66%]
tests/test_models_bart.py::test_bart[fairseq_bart_large] PASSED            [ 67%]
tests/test_models_bart.py::test_bart_cfg_registry PASSED                   [ 68%]
tests/test_models_bart.py::test_bart_cfg[bart_base] PASSED                 [ 69%]
tests/test_models_bart.py::test_bart_cfg[bart_large] PASSED                [ 70%]
tests/test_models_bert.py::test_list_pretrained_bert PASSED                [ 71%]
tests/test_models_bert.py::test_bert_small_cfg[ctx0-auto] PASSED           [ 72%]
tests/test_models_bert.py::test_bert_small_cfg[ctx0-NT] PASSED             [ 73%]
tests/test_models_bert.py::test_bert_small_cfg[ctx0-TN] PASSED             [ 74%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_cased_bert_base] PASSED [ 75%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_cased_bert_large] PASSED [ 76%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_cased_bert_wwm_large] PASSED [ 77%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_uncased_bert_base] PASSED [ 78%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_uncased_bert_large] PASSED [ 80%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_uncased_bert_wwm_large] PASSED [ 81%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_multi_cased_bert_base] PASSED [ 82%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_zh_bert_base] PASSED [ 83%]
tests/test_models_electra.py::test_list_pretrained_electra PASSED          [ 84%]
tests/test_models_electra.py::test_electra_model[ctx0-auto] PASSED         [ 85%]
tests/test_models_electra.py::test_electra_model[ctx0-NT] PASSED           [ 86%]
tests/test_models_electra.py::test_electra_model[ctx0-TN] PASSED           [ 87%]
tests/test_models_electra.py::test_electra_get_pretrained[ctx0-gluon_electra_small_owt] PASSED [ 88%]
tests/test_models_electra.py::test_electra_get_pretrained[ctx0-google_electra_base] PASSED [ 89%]
tests/test_models_electra.py::test_electra_get_pretrained[ctx0-google_electra_large] PASSED [ 90%]
tests/test_models_electra.py::test_electra_get_pretrained[ctx0-google_electra_small] PASSED [ 91%]
tests/test_models_gpt2.py::test_list_pretrained_gpt2 PASSED                [ 92%]
tests/test_models_gpt2.py::test_gpt2_small_config[ctx0-auto] PASSED        [ 93%]
tests/test_models_gpt2.py::test_gpt2_small_config[ctx0-TN] PASSED          [ 94%]
tests/test_models_gpt2.py::test_gpt2_small_config[ctx0-NT] PASSED          [ 95%]
tests/test_models_gpt2.py::test_gpt2_incremental_states[ctx0] PASSED       [ 96%]
tests/test_models_gpt2.py::test_gpt2[ctx0-gpt2_124M] PASSED                [ 97%]
tests/test_models_gpt2.py::test_gpt2[ctx0-gpt2_355M] PASSED                                                        [ 98%]
tests/test_models_gpt2.py::test_gpt2[ctx0-gpt2_774M] FAILED                                                                    [100%]

============================================================== FAILURES ==============================================================
_____________________________________________________ test_gpt2[ctx0-gpt2_774M] ______________________________________________________

model_name = 'gpt2_774M', ctx = gpu(0)

    @pytest.mark.slow
    @pytest.mark.remote_required
    @pytest.mark.parametrize('model_name', ['gpt2_124M', 'gpt2_355M', 'gpt2_774M'])
    def test_gpt2(model_name, ctx):
        # test from pretrained
        assert len(list_pretrained_gpt2()) > 0
        with tempfile.TemporaryDirectory() as root, ctx:
            cfg, tokenizer, params_path, lm_params_path =\
                get_pretrained_gpt2(model_name, load_backbone=True, load_lm=True, root=root)
            assert cfg.MODEL.vocab_size == len(tokenizer.vocab)
            # test backbone
            gpt2_model = GPT2Model.from_cfg(cfg)
            gpt2_model.load_parameters(params_path)
            # test lm model
            gpt2_lm_model = GPT2ForLM(cfg)
            gpt2_lm_model.load_parameters(lm_params_path)
    
            # test forward
            batch_size = 3
            seq_length = 32
            vocab_size = len(tokenizer.vocab)
            input_ids = mx.np.array(
                np.random.randint(
                    2,
                    vocab_size,
                    (batch_size, seq_length)
                ),
                dtype=np.int32,
                ctx=ctx
            )
            logits, _ = gpt2_lm_model(
                input_ids,
                gpt2_lm_model.init_states(batch_size, ctx)
            )
>           mx.npx.waitall()

tests/test_models_gpt2.py:142: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/local/lib/python3.6/dist-packages/mxnet/ndarray/ndarray.py:240: in waitall
    check_call(_LIB.MXNDArrayWaitAll())
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

ret = -1

    def check_call(ret):
        """Check the return value of C API call.
    
        This function will raise an exception when an error occurs.
        Wrap every API call with this function.
    
        Parameters
        ----------
        ret : int
            return value from API calls.
        """
        if ret != 0:
>           raise get_last_ffi_error()
E           mxnet.base.MXNetError: Traceback (most recent call last):
E             File "../src/storage/./pooled_storage_manager.h", line 192
E           MXNetError: Memory allocation failed out of memory

/usr/local/lib/python3.6/dist-packages/mxnet/base.py:246: MXNetError
-------------------------------------------------------- Captured stdout call --------------------------------------------------------
Downloading /tmp/tmpbj080s2v/gpt2_774M/gpt2-9dc62091.vocab from https://gluonnlp-numpy-data.s3-accelerate.amazonaws.com/models/gpt2_774M/gpt2-9dc62091.vocab...
Downloading /tmp/tmpbj080s2v/gpt2_774M/gpt2-396d4d8e.merges from https://gluonnlp-numpy-data.s3-accelerate.amazonaws.com/models/gpt2_774M/gpt2-396d4d8e.merges...
Downloading /tmp/tmpbj080s2v/gpt2_774M/model-9917e24e.params from https://gluonnlp-numpy-data.s3-accelerate.amazonaws.com/models/gpt2_774M/model-9917e24e.params...
Downloading /tmp/tmpbj080s2v/gpt2_774M/model_lm-cfbfa641.params from https://gluonnlp-numpy-data.s3-accelerate.amazonaws.com/models/gpt2_774M/model_lm-cfbfa641.params...
-------------------------------------------------------- Captured stderr call --------------------------------------------------------
100%|██████████| 558k/558k [00:00<00:00, 7.15MiB/s]
100%|██████████| 456k/456k [00:00<00:00, 6.39MiB/s]
100%|██████████| 3.10G/3.10G [01:16<00:00, 40.5MiB/s]
100%|██████████| 3.10G/3.10G [01:20<00:00, 38.6MiB/s]
========================================================== warnings summary ==========================================================
src/gluonnlp/attention_cell.py:715
  /workspace/gluon-nlp/src/gluonnlp/attention_cell.py:715: DeprecationWarning: invalid escape sequence \s
    """

src/gluonnlp/op.py:226
  /workspace/gluon-nlp/src/gluonnlp/op.py:226: DeprecationWarning: invalid escape sequence \p
    """

tests/test_models_albert.py: 6 warnings
tests/test_models_bart.py: 2 warnings
tests/test_models_bert.py: 3 warnings
tests/test_models_gpt2.py: 3 warnings
  /usr/local/lib/python3.6/dist-packages/mxnet/gluon/block.py:572: UserWarning: Parameter 'weight' is already initialized, ignoring. Set force_reinit=True to re-initialize.
    v.initialize(None, ctx, init, force_reinit=force_reinit)

-- Docs: https://docs.pytest.org/en/stable/warnings.html
====================================================== short test summary info =======================================================
FAILED tests/test_models_gpt2.py::test_gpt2[ctx0-gpt2_774M] - mxnet.base.MXNetError: Traceback (most recent call last):
======================================= 1 failed, 94 passed, 16 warnings in 1990.67s (0:33:10) =======================================

Possible memory leak.

There is a possible GPU memory leak when running test_models.py::test_tvm_integration on the October 22 nightly release (2.0.0b20201022).

python3 -m pytest --device='gpu' --verbose --runslow tests/test_models.py::test_tvm_integration
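
One way to check a suspected leak like this is to sample GPU memory between test runs and look at how it grows. The sketch below is not from the thread: it assumes `nvidia-smi` is on the PATH, and the helper names (`parse_memory_used`, `query_memory_used`, `growth`) are hypothetical. Because MXNet pools GPU memory, growth reported this way is a hint rather than proof of a leak.

```python
import subprocess


def parse_memory_used(csv_text):
    """Parse the output of
    `nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits`.

    Returns one integer (MiB in use) per GPU line; non-numeric lines
    such as "[N/A]" are skipped.
    """
    values = []
    for line in csv_text.splitlines():
        field = line.strip()
        if field.isdigit():
            values.append(int(field))
    return values


def query_memory_used():
    """Ask nvidia-smi for the memory currently in use on each GPU (MiB)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used",
         "--format=csv,noheader,nounits"],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # Python 3.6 compatible spelling of text=True
    )
    return parse_memory_used(result.stdout)


def growth(samples):
    """Return the change between consecutive samples taken for one GPU."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


if __name__ == "__main__":
    # Print a single sample; call this between test runs to build a series.
    print(query_memory_used())
```

Calling `query_memory_used()` after each test file and passing the series for GPU 0 to `growth` shows whether usage keeps climbing as the suite progresses.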

barry-jin · 2020-12-14T19:40:33Z

Here are the logs before and after reverting #19378

Before Revert

root@6a1ad75b3392:/workspace/incubator-mxnet# git log -1
commit 43750c8bfed6ca91fc47fd1fa6d620197e26c84c (HEAD)
Author: Przemyslaw Tredak <[email protected]>
Date:   Wed Oct 21 11:50:12 2020 -0700

    Remove cleanup on side threads (#19378)
    
    * Remove cleanup on side threads
    
    * removed comment
root@6a1ad75b3392:/workspace/incubator-mxnet# cd ../gluon-nlp/ ; python3 -m pytest --device='gpu' --verbose --runslow tests/test_models.py tests/test_models_albert.py tests/test_models_bart.py tests/test_models_bert.py tests/test_models_gpt2.py
Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=1033001789 to reproduce.
=================================== test session starts ====================================
platform linux -- Python 3.6.9, pytest-6.1.2, py-1.9.0, pluggy-0.13.1 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /workspace/gluon-nlp, configfile: pytest.ini
plugins: cov-2.10.1
collected 87 items                                                                         

tests/test_models.py::test_list_backbone_names PASSED                                [  1%]
tests/test_models.py::test_get_backbone[ctx0-google_albert_base_v2] PASSED           [  2%]
tests/test_models.py::test_get_backbone[ctx0-google_albert_large_v2] PASSED          [  3%]
tests/test_models.py::test_get_backbone[ctx0-google_albert_xlarge_v2] PASSED         [  4%]
tests/test_models.py::test_get_backbone[ctx0-google_albert_xxlarge_v2] PASSED        [  5%]
tests/test_models.py::test_get_backbone[ctx0-google_en_cased_bert_base] PASSED       [  6%]
tests/test_models.py::test_get_backbone[ctx0-google_en_cased_bert_large] PASSED      [  8%]
tests/test_models.py::test_get_backbone[ctx0-google_en_cased_bert_wwm_large] PASSED  [  9%]
tests/test_models.py::test_get_backbone[ctx0-google_en_uncased_bert_base] PASSED     [ 10%]
tests/test_models.py::test_get_backbone[ctx0-google_en_uncased_bert_large] PASSED    [ 11%]
tests/test_models.py::test_get_backbone[ctx0-google_en_uncased_bert_wwm_large] PASSED [ 12%]
tests/test_models.py::test_get_backbone[ctx0-google_multi_cased_bert_base] PASSED    [ 13%]
tests/test_models.py::test_get_backbone[ctx0-google_zh_bert_base] PASSED             [ 14%]
tests/test_models.py::test_get_backbone[ctx0-gluon_electra_small_owt] PASSED         [ 16%]
tests/test_models.py::test_get_backbone[ctx0-google_electra_base] PASSED             [ 17%]
tests/test_models.py::test_get_backbone[ctx0-google_electra_large] PASSED            [ 18%]
tests/test_models.py::test_get_backbone[ctx0-google_electra_small] PASSED            [ 19%]
tests/test_models.py::test_get_backbone[ctx0-gpt2_124M] PASSED                       [ 20%]
tests/test_models.py::test_get_backbone[ctx0-gpt2_1558M] PASSED                      [ 21%]
tests/test_models.py::test_get_backbone[ctx0-gpt2_355M] PASSED                       [ 22%]
tests/test_models.py::test_get_backbone[ctx0-gpt2_774M] PASSED                       [ 24%]
tests/test_models.py::test_get_backbone[ctx0-google_uncased_mobilebert] PASSED       [ 25%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_roberta_base] PASSED            [ 26%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_roberta_large] PASSED           [ 27%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_xlmr_base] PASSED               [ 28%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_xlmr_large] PASSED              [ 29%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_bart_base] PASSED               [ 31%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_bart_large] PASSED              [ 32%]
tests/test_models.py::test_tvm_integration[ctx0-NT-2-4-google_albert_base_v2] PASSED [ 33%]
tests/test_models.py::test_tvm_integration[ctx0-NT-2-4-google_en_cased_bert_base] PASSED [ 34%]
tests/test_models.py::test_tvm_integration[ctx0-NT-2-4-google_electra_small] PASSED  [ 35%]
tests/test_models.py::test_tvm_integration[ctx0-NT-2-4-fairseq_bart_base] PASSED     [ 36%]
tests/test_models.py::test_tvm_integration[ctx0-NT-1-4-google_albert_base_v2] PASSED [ 37%]
tests/test_models.py::test_tvm_integration[ctx0-NT-1-4-google_en_cased_bert_base] PASSED [ 39%]
tests/test_models.py::test_tvm_integration[ctx0-NT-1-4-google_electra_small] PASSED  [ 40%]
tests/test_models.py::test_tvm_integration[ctx0-NT-1-4-fairseq_bart_base] PASSED     [ 41%]
tests/test_models.py::test_tvm_integration[ctx0-TN-2-4-google_albert_base_v2] PASSED [ 42%]
tests/test_models.py::test_tvm_integration[ctx0-TN-2-4-google_en_cased_bert_base] PASSED [ 43%]
tests/test_models.py::test_tvm_integration[ctx0-TN-2-4-google_electra_small] PASSED  [ 44%]
tests/test_models.py::test_tvm_integration[ctx0-TN-2-4-fairseq_bart_base] PASSED     [ 45%]
tests/test_models.py::test_tvm_integration[ctx0-TN-1-4-google_albert_base_v2] PASSED [ 47%]
tests/test_models.py::test_tvm_integration[ctx0-TN-1-4-google_en_cased_bert_base] PASSED [ 48%]
tests/test_models.py::test_tvm_integration[ctx0-TN-1-4-google_electra_small] PASSED  [ 49%]
tests/test_models.py::test_tvm_integration[ctx0-TN-1-4-fairseq_bart_base] PASSED     [ 50%]
tests/test_models_albert.py::test_albert_backbone[auto-False-False] PASSED           [ 51%]
tests/test_models_albert.py::test_albert_backbone[auto-True-True] PASSED             [ 52%]
tests/test_models_albert.py::test_albert_backbone[NT-False-False] PASSED             [ 54%]
tests/test_models_albert.py::test_albert_backbone[NT-True-True] PASSED               [ 55%]
tests/test_models_albert.py::test_albert_backbone[TN-False-False] PASSED             [ 56%]
tests/test_models_albert.py::test_albert_backbone[TN-True-True] PASSED               [ 57%]
tests/test_models_albert.py::test_albert_for_mlm_model[auto] PASSED                  [ 58%]
tests/test_models_albert.py::test_albert_for_mlm_model[NT] PASSED                    [ 59%]
tests/test_models_albert.py::test_albert_for_mlm_model[TN] PASSED                    [ 60%]
tests/test_models_albert.py::test_albert_for_pretrain_model[auto] PASSED             [ 62%]
tests/test_models_albert.py::test_albert_for_pretrain_model[NT] PASSED               [ 63%]
tests/test_models_albert.py::test_albert_for_pretrain_model[TN] PASSED               [ 64%]
tests/test_models_albert.py::test_list_pretrained_albert PASSED                      [ 65%]
tests/test_models_albert.py::test_albert_get_pretrained[google_albert_base_v2] PASSED [ 66%]
tests/test_models_albert.py::test_albert_get_pretrained[google_albert_large_v2] PASSED [ 67%]
tests/test_models_albert.py::test_albert_get_pretrained[google_albert_xlarge_v2] PASSED [ 68%]
tests/test_models_albert.py::test_albert_get_pretrained[google_albert_xxlarge_v2] PASSED [ 70%]
tests/test_models_bart.py::test_list_pretrained_bart PASSED                          [ 71%]
tests/test_models_bart.py::test_bart[fairseq_bart_base] PASSED                       [ 72%]
tests/test_models_bart.py::test_bart[fairseq_bart_large] PASSED                      [ 73%]
tests/test_models_bart.py::test_bart_cfg_registry PASSED                             [ 74%]
tests/test_models_bart.py::test_bart_cfg[bart_base] PASSED                           [ 75%]
tests/test_models_bart.py::test_bart_cfg[bart_large] PASSED                          [ 77%]
tests/test_models_bert.py::test_list_pretrained_bert PASSED                          [ 78%]
tests/test_models_bert.py::test_bert_small_cfg[ctx0-auto] PASSED                     [ 79%]
tests/test_models_bert.py::test_bert_small_cfg[ctx0-NT] PASSED                       [ 80%]
tests/test_models_bert.py::test_bert_small_cfg[ctx0-TN] PASSED                       [ 81%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_cased_bert_base] PASSED [ 82%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_cased_bert_large] PASSED [ 83%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_cased_bert_wwm_large] PASSED [ 85%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_uncased_bert_base] PASSED [ 86%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_uncased_bert_large] PASSED [ 87%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_uncased_bert_wwm_large] PASSED [ 88%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_multi_cased_bert_base] PASSED [ 89%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_zh_bert_base] PASSED [ 90%]
tests/test_models_gpt2.py::test_list_pretrained_gpt2 PASSED                          [ 91%]
tests/test_models_gpt2.py::test_gpt2_small_config[ctx0-auto] PASSED                  [ 93%]
tests/test_models_gpt2.py::test_gpt2_small_config[ctx0-TN] PASSED                    [ 94%]
tests/test_models_gpt2.py::test_gpt2_small_config[ctx0-NT] PASSED                    [ 95%]
tests/test_models_gpt2.py::test_gpt2_incremental_states[ctx0] PASSED                 [ 96%]
tests/test_models_gpt2.py::test_gpt2[ctx0-gpt2_124M] PASSED                          [ 97%]
tests/test_models_gpt2.py::test_gpt2[ctx0-gpt2_355M] PASSED                          [ 98%]
tests/test_models_gpt2.py::test_gpt2[ctx0-gpt2_774M] FAILED                          [100%]

========================================= FAILURES =========================================
________________________________ test_gpt2[ctx0-gpt2_774M] _________________________________

model_name = 'gpt2_774M', ctx = gpu(0)

    @pytest.mark.slow
    @pytest.mark.remote_required
    @pytest.mark.parametrize('model_name', ['gpt2_124M', 'gpt2_355M', 'gpt2_774M'])
    def test_gpt2(model_name, ctx):
        # test from pretrained
        assert len(list_pretrained_gpt2()) > 0
        with tempfile.TemporaryDirectory() as root, ctx:
            cfg, tokenizer, params_path, lm_params_path =\
                get_pretrained_gpt2(model_name, load_backbone=True, load_lm=True, root=root)
            assert cfg.MODEL.vocab_size == len(tokenizer.vocab)
            # test backbone
            gpt2_model = GPT2Model.from_cfg(cfg)
            gpt2_model.load_parameters(params_path)
            # test lm model
            gpt2_lm_model = GPT2ForLM(cfg)
            gpt2_lm_model.load_parameters(lm_params_path)
    
            # test forward
            batch_size = 3
            seq_length = 32
            vocab_size = len(tokenizer.vocab)
            input_ids = mx.np.array(
                np.random.randint(
                    2,
                    vocab_size,
                    (batch_size, seq_length)
                ),
                dtype=np.int32,
                ctx=ctx
            )
            logits, _ = gpt2_lm_model(
                input_ids,
                gpt2_lm_model.init_states(batch_size, ctx)
            )
>           mx.npx.waitall()

tests/test_models_gpt2.py:142: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../incubator-mxnet/python/mxnet/ndarray/ndarray.py:240: in waitall
    check_call(_LIB.MXNDArrayWaitAll())
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

ret = -1

    def check_call(ret):
        """Check the return value of C API call.
    
        This function will raise an exception when an error occurs.
        Wrap every API call with this function.
    
        Parameters
        ----------
        ret : int
            return value from API calls.
        """
        if ret != 0:
>           raise get_last_ffi_error()
E           mxnet.base.MXNetError: Traceback (most recent call last):
E             File "../src/storage/./pooled_storage_manager.h", line 192
E           MXNetError: Memory allocation failed out of memory

../incubator-mxnet/python/mxnet/base.py:246: MXNetError
----------------------------------- Captured stdout call -----------------------------------
Downloading /tmp/tmpzxj5da72/gpt2_774M/gpt2-9dc62091.vocab from https://gluonnlp-numpy-data.s3-accelerate.amazonaws.com/models/gpt2_774M/gpt2-9dc62091.vocab...
Downloading /tmp/tmpzxj5da72/gpt2_774M/gpt2-396d4d8e.merges from https://gluonnlp-numpy-data.s3-accelerate.amazonaws.com/models/gpt2_774M/gpt2-396d4d8e.merges...
Downloading /tmp/tmpzxj5da72/gpt2_774M/model-9917e24e.params from https://gluonnlp-numpy-data.s3-accelerate.amazonaws.com/models/gpt2_774M/model-9917e24e.params...
Downloading /tmp/tmpzxj5da72/gpt2_774M/model_lm-cfbfa641.params from https://gluonnlp-numpy-data.s3-accelerate.amazonaws.com/models/gpt2_774M/model_lm-cfbfa641.params...
----------------------------------- Captured stderr call -----------------------------------
100%|██████████| 558k/558k [00:00<00:00, 3.45MiB/s]
100%|██████████| 456k/456k [00:00<00:00, 4.16MiB/s]
100%|██████████| 3.10G/3.10G [01:07<00:00, 45.9MiB/s]
100%|██████████| 3.10G/3.10G [01:18<00:00, 39.4MiB/s]
===================================== warnings summary =====================================
../incubator-mxnet/python/mxnet/contrib/onnx/mx2onnx/_op_translations.py:67
  /workspace/incubator-mxnet/python/mxnet/contrib/onnx/mx2onnx/_op_translations.py:67: DeprecationWarning: invalid escape sequence \(
    tuple_re = re.compile('\([0-9L|,| ]+\)')

src/gluonnlp/attention_cell.py:715
  /workspace/gluon-nlp/src/gluonnlp/attention_cell.py:715: DeprecationWarning: invalid escape sequence \s
    """

src/gluonnlp/op.py:226
  /workspace/gluon-nlp/src/gluonnlp/op.py:226: DeprecationWarning: invalid escape sequence \p
    """

tests/test_models_albert.py: 6 warnings
tests/test_models_bart.py: 2 warnings
tests/test_models_bert.py: 3 warnings
tests/test_models_gpt2.py: 3 warnings
  /workspace/incubator-mxnet/python/mxnet/gluon/block.py:572: UserWarning: Parameter 'weight' is already initialized, ignoring. Set force_reinit=True to re-initialize.
    v.initialize(None, ctx, init, force_reinit=force_reinit)

-- Docs: https://docs.pytest.org/en/stable/warnings.html
================================= short test summary info ==================================
FAILED tests/test_models_gpt2.py::test_gpt2[ctx0-gpt2_774M] - mxnet.base.MXNetError: Trac...
================== 1 failed, 86 passed, 17 warnings in 1718.22s (0:28:38) ==================
root@6a1ad75b3392:/workspace/gluon-nlp#

After Revert

root@6a1ad75b3392:/workspace/incubator-mxnet# git log -1
commit d786518725ebfdfceeea7b09d3ecb8edf6bbbfaa (HEAD)
Author: barry-jin <[email protected]>
Date:   Tue Dec 8 21:42:28 2020 +0000

    Revert "Remove cleanup on side threads (#19378)"
    
    This reverts commit 43750c8bfed6ca91fc47fd1fa6d620197e26c84c.
root@6a1ad75b3392:/workspace/incubator-mxnet# cd ../gluon-nlp/ ; python3 -m pytest --device='gpu' --verbose --runslow tests/test_models.py tests/test_models_albert.py tests/test_models_bart.py tests/test_models_bert.py tests/test_models_gpt2.py
Setting module np/mx/python random seeds, use MXNET_MODULE_SEED=1725596454 to reproduce.
=================================== test session starts ====================================
platform linux -- Python 3.6.9, pytest-6.1.2, py-1.9.0, pluggy-0.13.1 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /workspace/gluon-nlp, configfile: pytest.ini
plugins: cov-2.10.1
collected 87 items                                                                         

tests/test_models.py::test_list_backbone_names PASSED                                [  1%]
tests/test_models.py::test_get_backbone[ctx0-google_albert_base_v2] PASSED           [  2%]
tests/test_models.py::test_get_backbone[ctx0-google_albert_large_v2] PASSED          [  3%]
tests/test_models.py::test_get_backbone[ctx0-google_albert_xlarge_v2] PASSED         [  4%]
tests/test_models.py::test_get_backbone[ctx0-google_albert_xxlarge_v2] PASSED        [  5%]
tests/test_models.py::test_get_backbone[ctx0-google_en_cased_bert_base] PASSED       [  6%]
tests/test_models.py::test_get_backbone[ctx0-google_en_cased_bert_large] PASSED      [  8%]
tests/test_models.py::test_get_backbone[ctx0-google_en_cased_bert_wwm_large] PASSED  [  9%]
tests/test_models.py::test_get_backbone[ctx0-google_en_uncased_bert_base] PASSED     [ 10%]
tests/test_models.py::test_get_backbone[ctx0-google_en_uncased_bert_large] PASSED    [ 11%]
tests/test_models.py::test_get_backbone[ctx0-google_en_uncased_bert_wwm_large] PASSED [ 12%]
tests/test_models.py::test_get_backbone[ctx0-google_multi_cased_bert_base] PASSED    [ 13%]
tests/test_models.py::test_get_backbone[ctx0-google_zh_bert_base] PASSED             [ 14%]
tests/test_models.py::test_get_backbone[ctx0-gluon_electra_small_owt] PASSED         [ 16%]
tests/test_models.py::test_get_backbone[ctx0-google_electra_base] PASSED             [ 17%]
tests/test_models.py::test_get_backbone[ctx0-google_electra_large] PASSED            [ 18%]
tests/test_models.py::test_get_backbone[ctx0-google_electra_small] PASSED            [ 19%]
tests/test_models.py::test_get_backbone[ctx0-gpt2_124M] PASSED                       [ 20%]
tests/test_models.py::test_get_backbone[ctx0-gpt2_1558M] PASSED                      [ 21%]
tests/test_models.py::test_get_backbone[ctx0-gpt2_355M] PASSED                       [ 22%]
tests/test_models.py::test_get_backbone[ctx0-gpt2_774M] PASSED                       [ 24%]
tests/test_models.py::test_get_backbone[ctx0-google_uncased_mobilebert] PASSED       [ 25%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_roberta_base] PASSED            [ 26%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_roberta_large] PASSED           [ 27%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_xlmr_base] PASSED               [ 28%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_xlmr_large] PASSED              [ 29%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_bart_base] PASSED               [ 31%]
tests/test_models.py::test_get_backbone[ctx0-fairseq_bart_large] PASSED              [ 32%]
tests/test_models.py::test_tvm_integration[ctx0-NT-2-4-google_albert_base_v2] PASSED [ 33%]
tests/test_models.py::test_tvm_integration[ctx0-NT-2-4-google_en_cased_bert_base] PASSED [ 34%]
tests/test_models.py::test_tvm_integration[ctx0-NT-2-4-google_electra_small] PASSED  [ 35%]
tests/test_models.py::test_tvm_integration[ctx0-NT-2-4-fairseq_bart_base] PASSED     [ 36%]
tests/test_models.py::test_tvm_integration[ctx0-NT-1-4-google_albert_base_v2] PASSED [ 37%]
tests/test_models.py::test_tvm_integration[ctx0-NT-1-4-google_en_cased_bert_base] PASSED [ 39%]
tests/test_models.py::test_tvm_integration[ctx0-NT-1-4-google_electra_small] PASSED  [ 40%]
tests/test_models.py::test_tvm_integration[ctx0-NT-1-4-fairseq_bart_base] PASSED     [ 41%]
tests/test_models.py::test_tvm_integration[ctx0-TN-2-4-google_albert_base_v2] PASSED [ 42%]
tests/test_models.py::test_tvm_integration[ctx0-TN-2-4-google_en_cased_bert_base] PASSED [ 43%]
tests/test_models.py::test_tvm_integration[ctx0-TN-2-4-google_electra_small] PASSED  [ 44%]
tests/test_models.py::test_tvm_integration[ctx0-TN-2-4-fairseq_bart_base] PASSED     [ 45%]
tests/test_models.py::test_tvm_integration[ctx0-TN-1-4-google_albert_base_v2] PASSED [ 47%]
tests/test_models.py::test_tvm_integration[ctx0-TN-1-4-google_en_cased_bert_base] PASSED [ 48%]
tests/test_models.py::test_tvm_integration[ctx0-TN-1-4-google_electra_small] PASSED  [ 49%]
tests/test_models.py::test_tvm_integration[ctx0-TN-1-4-fairseq_bart_base] PASSED     [ 50%]
tests/test_models_albert.py::test_albert_backbone[auto-False-False] PASSED           [ 51%]
tests/test_models_albert.py::test_albert_backbone[auto-True-True] PASSED             [ 52%]
tests/test_models_albert.py::test_albert_backbone[NT-False-False] PASSED             [ 54%]
tests/test_models_albert.py::test_albert_backbone[NT-True-True] PASSED               [ 55%]
tests/test_models_albert.py::test_albert_backbone[TN-False-False] PASSED             [ 56%]
tests/test_models_albert.py::test_albert_backbone[TN-True-True] PASSED               [ 57%]
tests/test_models_albert.py::test_albert_for_mlm_model[auto] PASSED                  [ 58%]
tests/test_models_albert.py::test_albert_for_mlm_model[NT] PASSED                    [ 59%]
tests/test_models_albert.py::test_albert_for_mlm_model[TN] PASSED                    [ 60%]
tests/test_models_albert.py::test_albert_for_pretrain_model[auto] PASSED             [ 62%]
tests/test_models_albert.py::test_albert_for_pretrain_model[NT] PASSED               [ 63%]
tests/test_models_albert.py::test_albert_for_pretrain_model[TN] PASSED               [ 64%]
tests/test_models_albert.py::test_list_pretrained_albert PASSED                      [ 65%]
tests/test_models_albert.py::test_albert_get_pretrained[google_albert_base_v2] PASSED [ 66%]
tests/test_models_albert.py::test_albert_get_pretrained[google_albert_large_v2] PASSED [ 67%]
tests/test_models_albert.py::test_albert_get_pretrained[google_albert_xlarge_v2] PASSED [ 68%]
tests/test_models_albert.py::test_albert_get_pretrained[google_albert_xxlarge_v2] PASSED [ 70%]
tests/test_models_bart.py::test_list_pretrained_bart PASSED                          [ 71%]
tests/test_models_bart.py::test_bart[fairseq_bart_base] PASSED                       [ 72%]
tests/test_models_bart.py::test_bart[fairseq_bart_large] PASSED                      [ 73%]
tests/test_models_bart.py::test_bart_cfg_registry PASSED                             [ 74%]
tests/test_models_bart.py::test_bart_cfg[bart_base] PASSED                           [ 75%]
tests/test_models_bart.py::test_bart_cfg[bart_large] PASSED                          [ 77%]
tests/test_models_bert.py::test_list_pretrained_bert PASSED                          [ 78%]
tests/test_models_bert.py::test_bert_small_cfg[ctx0-auto] PASSED                     [ 79%]
tests/test_models_bert.py::test_bert_small_cfg[ctx0-NT] PASSED                       [ 80%]
tests/test_models_bert.py::test_bert_small_cfg[ctx0-TN] PASSED                       [ 81%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_cased_bert_base] PASSED [ 82%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_cased_bert_large] PASSED [ 83%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_cased_bert_wwm_large] PASSED [ 85%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_uncased_bert_base] PASSED [ 86%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_uncased_bert_large] PASSED [ 87%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_en_uncased_bert_wwm_large] PASSED [ 88%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_multi_cased_bert_base] PASSED [ 89%]
tests/test_models_bert.py::test_bert_get_pretrained[ctx0-google_zh_bert_base] PASSED [ 90%]
tests/test_models_gpt2.py::test_list_pretrained_gpt2 PASSED                          [ 91%]
tests/test_models_gpt2.py::test_gpt2_small_config[ctx0-auto] PASSED                  [ 93%]
tests/test_models_gpt2.py::test_gpt2_small_config[ctx0-TN] PASSED                    [ 94%]
tests/test_models_gpt2.py::test_gpt2_small_config[ctx0-NT] PASSED                    [ 95%]
tests/test_models_gpt2.py::test_gpt2_incremental_states[ctx0] PASSED                 [ 96%]
tests/test_models_gpt2.py::test_gpt2[ctx0-gpt2_124M] PASSED                          [ 97%]
tests/test_models_gpt2.py::test_gpt2[ctx0-gpt2_355M] PASSED                          [ 98%]
tests/test_models_gpt2.py::test_gpt2[ctx0-gpt2_774M] PASSED                          [100%]

===================================== warnings summary =====================================
../incubator-mxnet/python/mxnet/contrib/onnx/mx2onnx/_op_translations.py:67
  /workspace/incubator-mxnet/python/mxnet/contrib/onnx/mx2onnx/_op_translations.py:67: DeprecationWarning: invalid escape sequence \(
    tuple_re = re.compile('\([0-9L|,| ]+\)')

src/gluonnlp/attention_cell.py:715
  /workspace/gluon-nlp/src/gluonnlp/attention_cell.py:715: DeprecationWarning: invalid escape sequence \s
    """

src/gluonnlp/op.py:226
  /workspace/gluon-nlp/src/gluonnlp/op.py:226: DeprecationWarning: invalid escape sequence \p
    """

tests/test_models_albert.py: 6 warnings
tests/test_models_bart.py: 2 warnings
tests/test_models_bert.py: 3 warnings
tests/test_models_gpt2.py: 3 warnings
  /workspace/incubator-mxnet/python/mxnet/gluon/block.py:572: UserWarning: Parameter 'weight' is already initialized, ignoring. Set force_reinit=True to re-initialize.
    v.initialize(None, ctx, init, force_reinit=force_reinit)

-- Docs: https://docs.pytest.org/en/stable/warnings.html
======================= 87 passed, 17 warnings in 1928.37s (0:32:08) =======================
root@6a1ad75b3392:/workspace/gluon-nlp#
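
The two logs above are long and differ in a single test outcome. As a rough aid for comparing such runs, the sketch below extracts per-test outcomes from `pytest --verbose` output and reports only the tests whose result changed. The function names and the regular expression are illustrative assumptions, not part of pytest or of the thread.

```python
import re

# Matches verbose result lines such as
# "tests/test_models_gpt2.py::test_gpt2[ctx0-gpt2_774M] FAILED   [100%]".
# Summary lines that start with "FAILED " do not match, because the
# first token must contain "::".
RESULT_RE = re.compile(
    r"^(\S+::\S+)\s+(PASSED|FAILED|SKIPPED|ERROR|XFAIL|XPASS)\b"
)


def outcomes(log_text):
    """Map each test id found in a verbose pytest log to its outcome."""
    results = {}
    for line in log_text.splitlines():
        match = RESULT_RE.match(line.strip())
        if match:
            results[match.group(1)] = match.group(2)
    return results


def changed(before_log, after_log):
    """Return {test_id: (before, after)} for tests whose outcome differs."""
    before = outcomes(before_log)
    after = outcomes(after_log)
    return {
        test_id: (before.get(test_id), after.get(test_id))
        for test_id in sorted(set(before) | set(after))
        if before.get(test_id) != after.get(test_id)
    }
```

Applied to the "Before Revert" and "After Revert" logs, this would be expected to report only `tests/test_models_gpt2.py::test_gpt2[ctx0-gpt2_774M]`, which goes from FAILED to PASSED.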

andrei5055 · 2021-01-19T18:25:20Z

@barry-jin: To investigate this problem, I need to compile MXNet locally. Do you know which set of CMake options I should use for that?

barry-jin · 2021-01-19T18:39:45Z

I used the following commands to build MXNet locally and reproduce the issue:

$ git clone --recursive https://github.com/apache/incubator-mxnet
$ cd incubator-mxnet
$ git checkout 43750c8bfed6ca91fc47fd1fa6d620197e26c84c
$ cp config/linux_gpu.cmake config.cmake
$ mkdir build; cd build
$ cmake -GNinja -DCMAKE_BUILD_TYPE=Debug ..; ninja
$ cd ..
$ python3 -m pip install --user -e ./python
$ cd ~/workspace
$ git clone https://github.com/dmlc/gluon-nlp
$ cd ~/workspace/gluon-nlp
$ git checkout 8c8b0c9cda0853caa88fdbf4e0544986fbef243c
$ python3 -m pip install --quiet -e .[extras]
$ python3 -m pytest --device='gpu' --verbose --runslow tests/test_models.py tests/test_models_albert.py tests/test_models_bart.py tests/test_models_bert.py tests/test_models_gpt2.py

andrei5055 · 2021-01-19T21:53:02Z

Thanks a lot for the script! Unfortunately, I am having a linking problem:

root@28b3a2b8de7a:/opt/mxnet/build# ninja
[1/3] Linking CXX shared library libmxnet.so
FAILED: libmxnet.so 
. . .
Error copying file "/opt/mxnet/build/3rdparty/mkldnn/include/dnnl_config.h" to "/opt/mxnet/include/mkldnn/".
ninja: build stopped: subcommand failed.

The file dnnl_config.h is not present anywhere in incubator-mxnet.

barry-jin · 2021-01-19T23:04:13Z

You could try updating the 3rdparty submodules:

$ git clean -ffxd
$ git submodule update --init --recursive

andrei5055 · 2021-01-20T21:17:44Z

@barry-jin: Should the script you gave me reproduce this problem? I tried it, and I don't see the failure:
==== 71 passed, 16 skipped, 17 warnings in 1528.46s (0:25:28) ====
Just in case: 16 tests were skipped because "JVM is not supported". I'm not sure whether the memory problem would show up in one of those tests.

barry-jin · 2021-01-20T23:52:12Z

@andrei5055 Thanks for investigating. I think the message should be "TVM is not supported". You can follow the TVM documentation to install TVM. Alternatively, I will provide a test suite without TVM support that reproduces this issue.
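
Before a long run, it can help to confirm that an optional dependency such as TVM is importable, since tests that need it may otherwise be skipped. This is a minimal sketch using only the standard library; the helper name `module_available` is hypothetical, and whether gluon-nlp's own skip condition uses this exact check is an assumption.

```python
import importlib.util


def module_available(name):
    """Return True if the top-level module `name` can be found on the
    import path, without actually importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # "tvm" is the top-level package name of the TVM Python bindings.
    status = "available" if module_available("tvm") else "missing"
    print("tvm is " + status)
```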

barry-jin · 2021-01-21T04:48:31Z

You can check out gluon-nlp at dmlc/gluon-nlp@7910d6d and run the following test suite:

git checkout 7910d6d247ec9cb1b51cd49d79e3d474b087b188
python3 -m pytest --device='gpu' --verbose --runslow tests/test_attention_cell.py tests/test_data_batchify.py tests/test_data_filtering.py tests/test_data_sampler.py tests/test_data_tokenizers.py tests/test_embedding.py tests/test_gluon_block.py tests/test_initializer.py tests/test_layers.py tests/test_loss.py tests/test_models.py tests/test_models_albert.py tests/test_models_bart.py tests/test_models_bert.py tests/test_models_electra.py tests/test_models_gpt2.py tests/test_models_roberta.py tests/test_models_transformer.py
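
Since the failure was reported only when many test files run in one process, one way to narrow it down is to run growing prefixes of the file list and stop at the first prefix that fails. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the pytest options are copied from the command above, and any non-zero exit code counts as a failure, not only the out-of-memory error.

```python
import subprocess
import sys

# Options taken from the command in the thread; --device and --runslow
# are defined by the gluon-nlp test configuration, not by pytest itself.
BASE = [sys.executable, "-m", "pytest", "--device=gpu", "--verbose", "--runslow"]


def prefix_commands(test_files):
    """Build one command per prefix: first file, first two files, and so on."""
    return [BASE + test_files[:n] for n in range(1, len(test_files) + 1)]


def first_failing_prefix(test_files, run=subprocess.call):
    """Run each prefix in a fresh process and return the shortest failing one.

    `run` takes an argv list and returns an exit code; it is injectable so
    the search logic can be exercised without launching pytest.
    Returns None if every prefix exits with code 0.
    """
    for cmd in prefix_commands(test_files):
        if run(cmd) != 0:
            return cmd[len(BASE):]
    return None
```

Each prefix starts in a new Python process, so GPU memory held by an earlier run is released before the next one begins. The cost is that early files are executed repeatedly.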

andrei5055 · 2021-01-22T19:21:56Z

@barry-jin: I still cannot reproduce this problem:
========== 933 passed, 847 warnings in 2932.28s (0:48:52) =======

BTW, all warnings are of the following two types:
Type 1:

  /opt/mxnet/python/mxnet/gluon/block.py:1098: UserWarning: Parameter 0b7a2e74_c816_4146_bbb2_7973d2ca9112_gamma, 0af6619c_7075_430a_9226_8458e6ca733a_bias, c75fe6d3_81e7_4748_9894_f49abf4b5f2a_bias, 53661f2f_d20f_4c90_a539_173394b859d3_weight, 2b4ce060_94a7_4cd1_ac29_4bdc41789888_weight, e19ccd3d_cc61_44b2_ab1a_20e88f571877_bias, 8f53b519_069f_415a_bd05_c8b4ec58dd24_const, 99d015d6_eeca_4ad6_9fc6_1fb55e43b0f7_weight, 711c0a20_91e2_43c3_ba41_48f5fd2a3398_gamma, d852d48d_ca52_408a_83f3_2c11bf3a01b8_beta, e0417d39_d73a_4101_a440_f992b45a176e_weight, 3f5329d5_0903_448a_8c7a_65536aa507a1_bias, d08c8d34_3bca_4006_9843_aa5d069767cf_beta is not used by any computation. Is this intended?
    self._build_cache(*args)

Type 2:

  /opt/mxnet/python/mxnet/registry.py:108: UserWarning: New initializer mxnet.gluon.parameter.Init registered with name constant_140658119590520 isoverriding existing initializer mxnet.gluon.parameter.Init
    register(klass, name)

barry-jin added Bug needs triage labels Oct 23, 2020

barry-jin mentioned this issue Oct 27, 2020

[WIP] Revert "Remove cleanup on side threads (#19378)" #19432

Closed

6 tasks

barry-jin changed the title from "[BUG] Fatal Python error when running GluonNLP pytest on MXNet linux nightly build" to "Memory allocation failed out of memory" Nov 4, 2020

ann-qin-lu mentioned this issue Mar 19, 2022

GPU memory leak when using gluon.data.DataLoader with num_workers>0 #20959

Open
