binux / pyspider

A Powerful Spider(Web Crawler) System in Python.

Home Page: http://docs.pyspider.org/

PicklingError when running pyspider

MatrixJia opened this issue

  • pyspider version:
    pyspider-0.3.10
    Python 3.8.1

  • Operating system:
    macOS Catalina 10.15.4

  • Start up command:
    pyspider

Actual behavior

Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.8/bin/pyspider", line 11, in <module>
load_entry_point('pyspider==0.3.10', 'console_scripts', 'pyspider')()
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pyspider/run.py", line 754, in main
cli()
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/click/core.py", line 1236, in invoke
return Command.invoke(self, ctx)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/click/decorators.py", line 21, in new_func
return f(get_current_context(), *args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pyspider/run.py", line 165, in cli
ctx.invoke(all)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/click/decorators.py", line 21, in new_func
return f(get_current_context(), *args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pyspider/run.py", line 467, in all
threads.append(run_in(ctx.invoke, phantomjs, **phantomjs_config))
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pyspider/libs/utils.py", line 68, in run_in_subprocess
thread.start()
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
return Popen(process_obj)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function cli at 0x10820e0d0>: it's not the same object as pyspider.run.cli

Could you please help me figure out how to fix this issue?
Thank you.

I have the same problem

I have the same problem. How do I fix it?

me too 😭😭😭😭

Maybe the problem is the Python version. You can try a lower version.

I am sure the problem is the Python version. First uninstall your current Python, then install the version you want. I chose Python 3.7.5. Haha, you will find things much easier.

Me too, using Python 3.9.

So sad, I am on 3.9 and have the same problem.
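
Some background on why the Python version seems to matter: starting with Python 3.8, multiprocessing on macOS defaults to the "spawn" start method, and the traceback above does go through popen_spawn_posix.py. With spawn, the Process object and its target have to be pickled. Pickle stores a function by its module and name, and it refuses if looking that name up returns a different object. That is most likely what happens to cli in pyspider/run.py, since the click decorators replace the module-level name with a command object. Here is a minimal sketch of the same failure; Command, command and try_pickle are hypothetical stand-ins, not click's or pyspider's real code:

```python
import pickle


class Command:
    """Stand-in for a click command object (hypothetical, not click's class)."""

    def __init__(self, callback):
        self.callback = callback


def command(func):
    # Like a click decorator, return a wrapper object instead of the function,
    # so the module-level name no longer points at the function itself.
    return Command(func)


@command
def cli():
    return "running"


def try_pickle(obj):
    """Return None if obj pickles, otherwise the PicklingError message."""
    try:
        pickle.dumps(obj)
        return None
    except pickle.PicklingError as exc:
        return str(exc)


# The name `cli` is now a Command; the original function is only reachable
# as cli.callback, so pickle's lookup-by-name check fails.
error = try_pickle(cli.callback)
print(error)
```

On Python 3.7 and earlier, macOS used the "fork" start method by default, which does not pickle the target at all. That would explain why downgrading makes the error go away.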

You can use "multiprocessing_on_dill" (a third-party package that has to be installed first) in place of "multiprocessing".
Open the file "/Library/Frameworks/Python.framework/Versions/3.x/lib/python3.x/site-packages/pyspider/libs/utils.py" and change the function "run_in_subprocess" to:

def run_in_subprocess(func, *args, **kwargs):
    """Run function in subprocess, return a Process object"""
    from multiprocessing_on_dill import Process
    thread = Process(target=func, args=args, kwargs=kwargs)
    thread.daemon = True
    thread.start()
    return thread
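
If you would rather not install an extra package, another option may be to ask the standard library for the "fork" start method explicitly, which is what Python 3.7 and earlier used by default on macOS. This is only a sketch and I have not verified it against pyspider itself. Also keep in mind that the Python docs consider fork unsafe on macOS in some situations, and that it is not available on Windows.

```python
import multiprocessing


def run_in_subprocess(func, *args, **kwargs):
    """Run function in a forked subprocess, return a Process object"""
    # "fork" copies the parent process, so the target is never pickled.
    ctx = multiprocessing.get_context("fork")
    process = ctx.Process(target=func, args=args, kwargs=kwargs)
    process.daemon = True
    process.start()
    return process


if __name__ == "__main__":
    # Small demo: run print() in a child process and wait for it to finish.
    child = run_in_subprocess(print, "hello from the child process")
    child.join()
```

The function keeps the same name and signature as the one in utils.py, so it could be dropped in the same place as the version above.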

Me too, using Python 3.9.