Chromium cannot determine loading status when running in Docker
julianreis opened this issue
julianreis commented
I had to reinstall my server, and now about 80% of runs fail with the error below.
This is a fresh Docker installation. Is anyone else seeing the same problem?
I crawl "Kleinanzeigen" and "immobilienscout24".
[2024/02/24 15:05:57|chrome_wrapper.py |INFO ]: Initializing Chrome WebDriver for crawler...
[2024/02/24 15:05:58|patcher.py |INFO ]: patching driver executable /root/.local/share/undetected_chromedriver/undetected_chromedriver
[2024/02/24 15:05:59|__init__.py |INFO ]: setting properties for headless
Traceback (most recent call last):
File "/usr/src/app/flathunt.py", line 99, in <module>
main()
File "/usr/src/app/flathunt.py", line 95, in main
launch_flat_hunt(config, heartbeat)
File "/usr/src/app/flathunt.py", line 35, in launch_flat_hunt
hunter.hunt_flats()
File "/usr/src/app/flathunter/hunter.py", line 56, in hunt_flats
for expose in processor_chain.process(self.crawl_for_exposes(max_pages)):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/src/app/flathunter/hunter.py", line 35, in crawl_for_exposes
return chain(*[try_crawl(searcher, url, max_pages)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/src/app/flathunter/hunter.py", line 35, in <listcomp>
return chain(*[try_crawl(searcher, url, max_pages)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/src/app/flathunter/hunter.py", line 27, in try_crawl
return searcher.crawl(url, max_pages)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/src/app/flathunter/abstract_crawler.py", line 151, in crawl
return self.get_results(url, max_pages)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/src/app/flathunter/abstract_crawler.py", line 139, in get_results
soup = self.get_page(search_url)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/src/app/flathunter/crawler/kleinanzeigen.py", line 56, in get_page
return self.get_soup_from_url(search_url, driver=self.get_driver())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/backoff/_sync.py", line 105, in retry
ret = target(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/src/app/flathunter/abstract_crawler.py", line 70, in get_soup_from_url
driver.get(url)
File "/usr/local/lib/python3.11/site-packages/undetected_chromedriver/__init__.py", line 629, in get_wrapped
return orig_get(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/undetected_chromedriver/__init__.py", line 665, in get
return super().get(url)
^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 356, in get
self.execute(Command.GET, {"url": url})
File "/usr/local/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 347, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python3.11/site-packages/selenium/webdriver/remote/errorhandler.py", line 229, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: session deleted because of page crash
from unknown error: cannot determine loading status
from tab crashed
(Session info: chrome=122.0.6261.57)
Stacktrace:
#0 0x561cd54b1793 <unknown>
#1 0x561cd51a5017 <unknown>
#2 0x561cd518dca2 <unknown>
#3 0x561cd518d51f <unknown>
#4 0x561cd518c3e9 <unknown>
#5 0x561cd518c274 <unknown>
#6 0x561cd518acf4 <unknown>
#7 0x561cd518b3ff <unknown>
#8 0x561cd519b665 <unknown>
#9 0x561cd51b0d8c <unknown>
#10 0x561cd51b636b <unknown>
#11 0x561cd518ba8e <unknown>
#12 0x561cd51b0986 <unknown>
#13 0x561cd5231131 <unknown>
#14 0x561cd5212173 <unknown>
#15 0x561cd51e32d3 <unknown>
#16 0x561cd51e3c9e <unknown>
#17 0x561cd54758cb <unknown>
#18 0x561cd5479745 <unknown>
#19 0x561cd54622e1 <unknown>
#20 0x561cd547a2d2 <unknown>
#21 0x561cd544617f <unknown>
#22 0x561cd549fdc8 <unknown>
#23 0x561cd549ffc3 <unknown>
#24 0x561cd54b0944 <unknown>
#25 0x7fc668fce134 <unknown>
[2024/02/24 15:06:01|__init__.py |INFO ]: ensuring close
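For context: "session deleted because of page crash ... from tab crashed" inside Docker is commonly caused by Chrome exhausting the container's shared-memory segment, which Docker caps at 64 MB by default. A minimal sketch of the usual workarounds, assuming the container is launched by hand (the image name `flathunter` here is illustrative, not the project's actual image):

```shell
# Option 1: give the container a larger /dev/shm so Chrome's renderer
# processes have enough shared memory (Docker's default is only 64 MB).
docker run --shm-size=2g flathunter

# Option 2: bind-mount the host's /dev/shm into the container instead.
docker run -v /dev/shm:/dev/shm flathunter
```

Alternatively, Chrome itself can be told to stop using /dev/shm by passing the `--disable-dev-shm-usage` switch (e.g. via `ChromeOptions.add_argument`), which makes it fall back to writing shared memory under /tmp. Whether flathunter's `chrome_wrapper.py` already sets that flag would need to be checked in the source.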