No module named 'selenium.settings'
landowark opened this issue · comments
I have been trying to get started with the tutorial at https://www.geeksforgeeks.org/scraping-javascript-enabled-websites-using-scrapy-selenium/, and as far as I can tell I've followed the examples correctly, but when I run scrapy crawl integratedspider
I get the following error:
Traceback (most recent call last):
  File "/opt/anaconda/envs/scrapy_selenium/bin/scrapy", line 8, in <module>
    sys.exit(execute())
  File "/opt/anaconda/envs/scrapy_selenium/lib/python3.7/site-packages/scrapy/cmdline.py", line 114, in execute
    settings = get_project_settings()
  File "/opt/anaconda/envs/scrapy_selenium/lib/python3.7/site-packages/scrapy/utils/project.py", line 69, in get_project_settings
    settings.setmodule(settings_module_path, priority='project')
  File "/opt/anaconda/envs/scrapy_selenium/lib/python3.7/site-packages/scrapy/settings/__init__.py", line 287, in setmodule
    module = import_module(module)
  File "/opt/anaconda/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'selenium.settings'
Added to the bottom of settings.py:

from shutil import which

SELENIUM_DRIVER_NAME = 'chrome'
SELENIUM_DRIVER_EXECUTABLE_PATH = which('chromedriver')
SELENIUM_DRIVER_ARGUMENTS = ['--headless']

DOWNLOADER_MIDDLEWARES = {
    'scrapy_selenium.SeleniumMiddleware': 800
}
integratedspider.py
import scrapy
from scrapy_selenium import SeleniumRequest

class IntegratedspiderSpider(scrapy.Spider):
    name = 'integratedspider'

    def start_requests(self):
        yield SeleniumRequest(
            url="https://practice.geeksforgeeks.org/courses/online",
            wait_time=3,
            screenshot=True,
            callback=self.parse,
            dont_filter=True
        )

    def parse(self, response):
        # this xpath matches the cards containing the course details
        courses = response.xpath('//*[@id ="active-courses-content"]/div/div/div')
        for course in courses:
            # text() scrapes the text of the h4 tag holding the course name
            course_name = course.xpath('.//a/div[2]/div/div[2]/h4/text()').get()
            # course_name contains \n and extra spaces; remove them
            course_name = course_name.split('\n')[1]
            course_name = course_name.strip()
            yield {
                'course Name': course_name
            }
Any help would be appreciated. Thanks!
It looks like Scrapy cannot load the settings module because it can't find it. What did you name your module? You should not name it selenium, since a module with that name already exists. Is the settings.py file there? In the GeeksforGeeks example, their module name is scrapyselenium, so the settings module is scrapyselenium.settings. The settings module is mandatory for Scrapy projects.
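For reference, a freshly generated project looks roughly like the sketch below (assuming the project name scrapyselenium, as in the tutorial; the exact file list may vary by Scrapy version). Scrapy resolves the settings module from the settings entry in scrapy.cfg, so the package name there must match the directory containing settings.py:

```
scrapyselenium/                 # project root, created by `scrapy startproject scrapyselenium`
├── scrapy.cfg                  # contains: [settings] default = scrapyselenium.settings
└── scrapyselenium/             # the project package
    ├── __init__.py
    ├── settings.py             # must be importable as scrapyselenium.settings
    ├── items.py
    ├── middlewares.py
    ├── pipelines.py
    └── spiders/
        ├── __init__.py
        └── integratedspider.py
```

If the package were named selenium instead, the import of selenium.settings would be shadowed by the installed Selenium library, producing exactly the ModuleNotFoundError above.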
Okay. I created a new project and renamed everything and it works. Total noob mistake on my part. Sorry. Thanks for your help.