MaartenGr / KeyBERT

Minimal keyword extraction with BERT

Home Page: https://MaartenGr.github.io/KeyBERT/


Segmentation Fault While Running in Docker

ShaneOH opened this issue

Hi, when running this on my machine (MacBook Pro M2), everything works fine. However, when I try to run it inside Docker, I get a seg fault when calling extract_keywords:

>>> from keybert import KeyBERT
>>> kw_model = KeyBERT()
>>> kw_model
<keybert._model.KeyBERT object at 0xffff21b22490>
>>> keywords = kw_model.extract_keywords('test me')

Fatal Python error: Segmentation fault

So instantiating the model actually works fine, but the extract_keywords call breaks. Here's some debug output when I run the Python interpreter via python -q -X faulthandler:

>>> from keybert import KeyBERT
>>> kw_model = KeyBERT()
>>> keywords = kw_model.extract_keywords('test me')
Fatal Python error: Segmentation fault

Thread 0x0000ffff2119f1a0 (most recent call first):
  File "/usr/local/lib/python3.11/threading.py", line 331 in wait
  File "/usr/local/lib/python3.11/threading.py", line 629 in wait
  File "/usr/local/lib/python3.11/site-packages/tqdm/_monitor.py", line 60 in run
  File "/usr/local/lib/python3.11/threading.py", line 1045 in _bootstrap_inner
  File "/usr/local/lib/python3.11/threading.py", line 1002 in _bootstrap

Current thread 0x0000ffff81c26020 (most recent call first):
  File "/usr/local/lib/python3.11/site-packages/transformers/activations.py", line 78 in forward
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527 in _call_impl
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518 in _wrapped_call_impl
  File "/usr/local/lib/python3.11/site-packages/transformers/models/bert/modeling_bert.py", line 452 in forward
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527 in _call_impl
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518 in _wrapped_call_impl
  File "/usr/local/lib/python3.11/site-packages/transformers/models/bert/modeling_bert.py", line 551 in feed_forward_chunk
  File "/usr/local/lib/python3.11/site-packages/transformers/pytorch_utils.py", line 240 in apply_chunking_to_forward
  File "/usr/local/lib/python3.11/site-packages/transformers/models/bert/modeling_bert.py", line 539 in forward
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527 in _call_impl
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518 in _wrapped_call_impl
  File "/usr/local/lib/python3.11/site-packages/transformers/models/bert/modeling_bert.py", line 612 in forward
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527 in _call_impl
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518 in _wrapped_call_impl
  File "/usr/local/lib/python3.11/site-packages/transformers/models/bert/modeling_bert.py", line 1022 in forward
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527 in _call_impl
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518 in _wrapped_call_impl
  File "/usr/local/lib/python3.11/site-packages/sentence_transformers/models/Transformer.py", line 66 in forward
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527 in _call_impl
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518 in _wrapped_call_impl
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/container.py", line 215 in forward
  File "/usr/local/lib/python3.11/site-packages/sentence_transformers/SentenceTransformer.py", line 165 in encode
  File "/usr/local/lib/python3.11/site-packages/keybert/backend/_sentencetransformers.py", line 62 in embed
  File "/usr/local/lib/python3.11/site-packages/keybert/_model.py", line 176 in extract_keywords
  File "<stdin>", line 1 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, sklearn.__check_build._check_build, scipy._lib._ccallback_c, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._isolve._iterative, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg.cython_lapack, scipy.linalg._cythonized_array_utils, scipy.linalg._solve_toeplitz, scipy.linalg._decomp_lu_cython, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.linalg._flinalg, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.spatial._ckdtree, scipy._lib.messagestream, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.spatial.transform._rotation, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy.optimize._trlib._trlib, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize.__nnls, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, scipy.optimize._highs.cython.src._highs_constants, 
scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.optimize._direct, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.special.cython_special, scipy.stats._stats, scipy.stats.beta_ufunc, scipy.stats._boost.beta_ufunc, scipy.stats.binom_ufunc, scipy.stats._boost.binom_ufunc, scipy.stats.nbinom_ufunc, scipy.stats._boost.nbinom_ufunc, scipy.stats.hypergeom_ufunc, scipy.stats._boost.hypergeom_ufunc, scipy.stats.ncf_ufunc, scipy.stats._boost.ncf_ufunc, scipy.stats.ncx2_ufunc, scipy.stats._boost.ncx2_ufunc, scipy.stats.nct_ufunc, scipy.stats._boost.nct_ufunc, scipy.stats.skewnorm_ufunc, scipy.stats._boost.skewnorm_ufunc, scipy.stats.invgauss_ufunc, scipy.stats._boost.invgauss_ufunc, scipy.interpolate._fitpack, scipy.interpolate.dfitpack, scipy.interpolate._bspl, scipy.interpolate._ppoly, scipy.interpolate.interpnd, scipy.interpolate._rbfinterp_pythran, scipy.interpolate._rgi_cython, scipy.stats._biasedurn, scipy.stats._levy_stable.levyst, scipy.stats._stats_pythran, scipy._lib._uarray._uarray, scipy.stats._statlib, scipy.stats._sobol, scipy.stats._qmc_cy, scipy.stats._mvn, scipy.stats._rcont.rcont, sklearn.utils._isfinite, sklearn.utils.murmurhash, sklearn.utils._openmp_helpers, sklearn.metrics.cluster._expected_mutual_info_fast, sklearn.utils._logistic_sigmoid, sklearn.utils.sparsefuncs_fast, sklearn.preprocessing._csr_polynomial_expansion, sklearn.preprocessing._target_encoder_fast, sklearn.metrics._dist_metrics, sklearn.metrics._pairwise_distances_reduction._datasets_pair, sklearn.utils._cython_blas, sklearn.metrics._pairwise_distances_reduction._base, sklearn.metrics._pairwise_distances_reduction._middle_term_computer, sklearn.utils._heap, sklearn.utils._sorting, sklearn.metrics._pairwise_distances_reduction._argkmin, sklearn.metrics._pairwise_distances_reduction._argkmin_classmode, sklearn.utils._vector_sentinel, 
sklearn.metrics._pairwise_distances_reduction._radius_neighbors, sklearn.metrics._pairwise_fast, sklearn.feature_extraction._hashing_fast, torch._C, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special, regex._regex, sklearn.utils._random, sklearn.utils._seq_dataset, sklearn.linear_model._cd_fast, sklearn._loss._loss, sklearn.utils.arrayfuncs, sklearn.svm._liblinear, sklearn.svm._libsvm, sklearn.svm._libsvm_sparse, sklearn.utils._weight_vector, sklearn.linear_model._sgd_fast, sklearn.linear_model._sag_fast, scipy.io.matlab._mio_utils, scipy.io.matlab._streams, scipy.io.matlab._mio5_utils, sklearn.datasets._svmlight_format_fast, charset_normalizer.md, yaml._yaml, sentencepiece._sentencepiece, PIL._imaging (total: 163)
Segmentation fault
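For reference, the same per-thread traceback can also be captured without the -X flag by enabling the stdlib faulthandler directly in the script, which is handy inside a container entrypoint:

```python
# Enable the stdlib fault handler programmatically -- equivalent to
# launching Python with `-X faulthandler`. On a segfault, Python dumps
# the tracebacks of all threads to stderr before the process dies.
import faulthandler

faulthandler.enable()
print(faulthandler.is_enabled())  # -> True
```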

In my Dockerfile I have FROM python:3.11, which is the same Python version as on my local machine (where, again, everything works fine).

When I check the container stats, I can see my MEM LIMIT is around 8 GB, and when I run this little test script, memory only rises to around 200 MB -- although the CPU % spikes really high (100-300%), so I'm not sure if that's related.

Any idea how to continue debugging this?

Not quite sure what is happening here. I believe it has something to do with your device, namely the MacBook. I remember these kinds of issues popping up in BERTopic when using newer MacBooks. I believe a fix for this might be setting device='mps' when using a SentenceTransformer model.
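A rough sketch of what I mean (pick_device is just an illustrative helper, not a KeyBERT API) -- only select mps when the current torch build actually supports it, and fall back otherwise:

```python
# Sketch: choose a device the current torch build actually supports.
# torch.backends.mps.is_available() is False on Linux (including Docker
# on Apple Silicon), so inside a container this falls back to "cpu"
# instead of raising "PyTorch is not linked with support for mps devices".
import torch


def pick_device() -> str:
    if torch.backends.mps.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


# Usage with a SentenceTransformer model, e.g.:
# model = SentenceTransformer("all-MiniLM-L6-v2", device=pick_device())
```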

Hey @MaartenGr -- thanks for the response. So I tried this inside Docker:

from keybert import KeyBERT
from sentence_transformers import SentenceTransformer

text_input = 'test me'

model = SentenceTransformer("all-MiniLM-L6-v2", device="mps")

kw_model = KeyBERT(model)

keywords = kw_model.extract_keywords(text_input, keyphrase_ngram_range=(1, 4))

Which results in this:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/test.py", line 14, in <module>
    keywords = kw_model.extract_keywords(text_input, keyphrase_ngram_range=(1, 4))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/keybert/_model.py", line 176, in extract_keywords
    doc_embeddings = self.model.embed(docs)
                     ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/keybert/backend/_sentencetransformers.py", line 62, in embed
    embeddings = self.embedding_model.encode(documents, show_progress_bar=verbose)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sentence_transformers/SentenceTransformer.py", line 153, in encode
    self.to(device)
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1160, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 810, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 810, in _apply
    module._apply(fn)
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 810, in _apply
    module._apply(fn)
  [Previous line repeated 1 more time]
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 833, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1158, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: PyTorch is not linked with support for mps devices

Could the issue have something to do with PyTorch? I had seen a similar issue here #146, but in that case it seemed the model could not even be instantiated. I can instantiate the model fine; it's calling extract_keywords that causes the seg fault.

I'm not sure if I need to install anything other than keybert<1.0 in my requirements.txt? My Dockerfile just uses python:3.11 and installs from requirements.txt.

Meanwhile, all of this test code works perfectly fine outside Docker, on the same MacBook, from the command line. It's only when I run it inside my Docker container that it fails. So it seems the container could be missing something, but I don't see anything in the KeyBERT docs about needing to install more than keybert for this minimal example. Maybe it's something that happens to be installed on my machine but isn't included in the Docker image?

I believe this is related to PyTorch and the Python version that you have installed. Could you check whether the PyTorch and Python versions in your local and Docker environments are exactly the same?

@MaartenGr looks like there's a slight difference

Outside Docker:

(venv) shane@Shanes-PC % python --version
Python 3.11.4

(venv) shane@Shanes-PC % python -c "import torch; print(torch.__version__)"
2.0.1

Inside docker:

root@6e4452c64315:/app# python --version
Python 3.11.6

root@6e4452c64315:/app# python -c "import torch; print(torch.__version__)"
2.1.0

Note that my requirements.txt did not include torch directly, only keybert<1.0.

So if I add torch==2.0.1 in my requirements.txt and rebuild the image... it works!

So to recap: downgrading torch from 2.1.0 to 2.0.1 in my Docker container solved this issue. I double-checked by upgrading back to 2.1.0 and confirmed it breaks again with a seg fault; downgrading once more makes it work again.
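For anyone hitting the same thing, the whole fix was a one-line pin (this is my setup, yours may differ):

```text
# requirements.txt -- pinning torch alongside keybert avoided the segfault
torch==2.0.1
keybert<1.0
```

Then rebuild the image so pip resolves torch 2.0.1 instead of the latest wheel.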

Interestingly, if I upgrade torch from 2.0.1 to 2.1.0 (via pip install torch) in my venv outside of Docker (i.e. directly on my machine), it still works.

Really not sure what's going on there at the core, but if it works it works so I'll just keep pytorch pinned to 2.0.1 for now 😅

Feel free to close this issue as solved, not sure who might want to be looking into whatever's going on with Docker + pytorch in the bigger picture.

Thanks for your responsiveness on this too -- I really appreciate it!

Glad to hear that you solved the issue! Most likely, the difference comes down to the torch version (and whether it was built with CUDA support or not). I'll close this for now, but if somebody else runs into this issue, I'll make sure to re-open it.