struct.error and OverflowError
minhlab opened this issue
These errors occur while I am running cort-train. My setup is Ubuntu 16.04.3, 64 GB RAM, 4 CPUs.
Process ForkPoolWorker-9:
Traceback (most recent call last):
File "/usr/lib/python3.5/multiprocessing/pool.py", line 125, in worker
put((job, i, result))
File "/usr/lib/python3.5/multiprocessing/queues.py", line 355, in put
self._writer.send_bytes(obj)
File "/usr/lib/python3.5/multiprocessing/connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "/usr/lib/python3.5/multiprocessing/connection.py", line 393, in _send_bytes
header = struct.pack("!i", n)
struct.error: 'i' format requires -2147483648 <= number <= 2147483647
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.5/multiprocessing/pool.py", line 130, in worker
put((job, i, (False, wrapped)))
File "/usr/lib/python3.5/multiprocessing/queues.py", line 349, in put
obj = ForkingPickler.dumps(obj)
File "/usr/lib/python3.5/multiprocessing/reduction.py", line 50, in dumps
cls(buf, protocol).dump(obj)
OverflowError: cannot serialize a string larger than 4GiB
[The same traceback repeats for ForkPoolWorker-10 and ForkPoolWorker-11.]
This is a multiprocessing error that occurs when too much data is passed between processes. I have occasionally run into it myself when experimenting with larger feature sets. A quick workaround would be to disable multiprocessing; however, that would vastly increase the running time of feature extraction. Is this an option for you?
A more principled solution would be a rewrite of the feature extraction code that allows for more efficient feature extraction/combination, but that will take some time.
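For reference, here is a rough sketch of the mechanism with plain multiprocessing (this is generic illustration code, not anything from cort): each pipe message between a pool worker and the parent is prefixed with a signed 32-bit length header (the struct.pack("!i", n) in the traceback), so a pickled result larger than 2 GiB cannot be sent back, and Python 3.5's default pickle protocol 3 cannot serialize a single string/bytes object larger than 4 GiB at all.

```python
import multiprocessing


def big_result(_):
    # Return one large bytes object (~3 GiB, so actually running this needs
    # that much free memory per worker). Its pickled form is larger than the
    # 2 GiB that fits into the signed 32-bit length header written by
    # struct.pack("!i", n), so sending it back to the parent process fails
    # with the struct.error seen above.
    return b"x" * (3 * 1024 ** 3)


if __name__ == "__main__":
    with multiprocessing.Pool(processes=2) as pool:
        pool.map(big_result, range(2))  # fails when results are sent back
```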
Hi @smartschat, thanks for answering. For some reason, the error doesn't occur when I run on a different machine: CentOS 7.2.1511, 62 GB RAM, 32 CPUs. Do you know why there is a difference?
BTW, is it the size of the features of a single document that exceeds 4 GiB, or the aggregated features of multiple documents?
Unfortunately I don't know why the behavior differs between these setups. :/
It's the size of all the features for an aggregation of documents. As far as I understand, the following happens: Python divides the data into chunks for multiprocessing, and the chunks are processed by the worker processes. The results for each chunk are passed back to the parent process using Python's pickle mechanism. If the pickled results for a chunk are too big, the error occurs.
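A generic sketch of that behavior (not cort's actual code; whether the chunk size can be tuned in cort is a separate question): with Pool.map, a worker pickles the whole list of results for its chunk into one message, so forcing a chunk size of 1 keeps each message down to a single document's features, at the cost of more inter-process round trips.

```python
import multiprocessing
from collections import Counter


def extract_features(doc):
    # Toy stand-in for per-document feature extraction: just count tokens.
    # In the real setting, one document's features can already be large.
    return Counter(doc.split())


if __name__ == "__main__":
    docs = ["a b c", "a a d", "b c c"] * 100  # toy corpus
    with multiprocessing.Pool() as pool:
        # By default, map() groups the documents into chunks and each worker
        # pickles the entire list of results for its chunk into one pipe
        # message. With chunksize=1, every message holds the features of a
        # single document, so it can only overflow if one document's features
        # alone exceed the limits.
        features = pool.map(extract_features, docs, chunksize=1)
    print(len(features))
```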