Downloading Google images through a proxy server fails on Linux
yfzmk2013 opened this issue · comments
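Currently, on Linux, downloading Google images through a proxy server does not run correctly from the command line.
@yfzmk2013 Please be more specific.
For example: your system environment, how you ran the code, whether you modified the code, what the failure looks like, and what error was reported.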
Traceback (most recent call last):
File "image_downloader_google.py", line 100, in <module>
browser="phantomjs")
File "/home/yanhao/project/DengHong_Git/Image-Downloader/crawler.py", line 254, in crawl_image_urls
service_args=phantomjs_args, desired_capabilities=dcap)
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/phantomjs/webdriver.py", line 58, in __init__
desired_capabilities=desired_capabilities)
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/remote/webdriver.py", line 92, in __init__
self.start_session(desired_capabilities, browser_profile)
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/remote/webdriver.py", line 179, in start_session
response = self.execute(Command.NEW_SESSION, capabilities)
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/remote/webdriver.py", line 236, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/remote/errorhandler.py", line 163, in check_response
raise exception_class(value)
selenium.common.exceptions.WebDriverException: Message:
/*
- CSS for Privoxy CGI and script output
- Id: cgi-style.css,v
*/
/*
- General rules: Font, Color, Headings, Margins, Links
*/
body,td,th { font-family: arial, helvetica, helv, sans-serif; }
body { background-color: #ffffff; color: #000000; }
h1 { font-size: 140%; margin: 0px; }
h2 { font-size: 120%; margin: 0px; }
h3 { font-size: 110%; margin: 0px; }
p,pre { margin-left: 15px; }
li { margin: 2px 15px; }
dl { margin: 2px 15px; }
a:link { color: #0000dd; text-decoration: none; }
a:visited { color: #330099; text-decoration: none; }
a:active { color: #3333ff; text-decoration: none; }
/*
- Boxen as Table elements:
*/
td.title { border: solid black 1px; background-color: #dddddd; }
td.box { border: solid black 1px; background-color: #eeeeee; }
td.info { border: solid black 1px; background-color: #ccccff; }
td.warning { border: solid black 1px; background-color: #ffdddd; }
/*
- Special Table Boxen: for nesting, naked container and for
- the Status field in CGI Output:
*/
td.wrapbox { border: solid black 1px; padding: 5px; }
td.container { padding: 0px; }
td.status { border: solid black 1px; background-color: #ff0000; color: #ffffff; font-size: 300%; font-weight: bolder; }
/*
- Same Boxen as s:
*/
div.title { border: solid black 1px; background-color: #dddddd; margin: 20px; padding: 20px; }
div.box { border: solid black 1px; background-color: #eeeeee; margin: 20px; padding: 20px; }
div.info { border: solid black 1px; background-color: #ccccff; margin: 20px; padding: 20px; }
div.warning { border: solid black 1px; background-color: #ffdddd; margin: 20px; padding: 20px; }
div.wrapbox { border: solid black 1px; margin: 20px; padding: 5px; }
/*
- Bold definitions in
- s, grey BG for table headings, transparent (no-bordered) table
*/
dt { font-weight: bold; }
th { background-color: #dddddd; }
table.transparent { border-style: none}
/*
- Special purpose paragraphs: Small for page footers,
- Important for quoting wrong or dangerous examples,
- Whiteframed for the toggle?mini=y CGI
*/
p.small { font-size: 10px; margin: 0px; }
p.important { border: solid black 1px; background-color: #ffdddd; font-weight: bold; padding: 2px; }
p.whiteframed { margin: 5px; padding: 5px; border: solid black 1px; text-align: center; background-color: #eeeeee; }
/*
- Links as buttons:
*/
td.buttons {
padding: 2px;
}
a.cmd, td.indentbuttons a, td.buttons a {
white-space: nowrap;
width: auto;
padding: 2px;
background-color: #dddddd;
color: #000000;
text-decoration: none;
border-top: 1px solid #ffffff;
border-left: 1px solid #ffffff;
border-bottom: 1px solid #000000;
border-right: 1px solid #000000;
}
a.cmd:hover, td.indentbuttons a:hover, td.buttons a:hover {
background-color: #eeeeee;
}
a.cmd:active, td.indentbuttons a:active, td.buttons a:active {
border-top: 1px solid #000000;
border-left: 1px solid #000000;
border-bottom: 1px solid #ffffff;
border-right: 1px solid #ffffff;
}
/*
- Special red emphasis:
*/
em.warning, strong.warning { color: #ff0000 }
/*
- In show-status we use tables directly behind headlines
- and for some reason or another the headlines are set to
- "margin:0" and leave the tables no air to breath.
- A proper fix would be to replace or remove the "margin:0",
- but as this affects every cgi page we do it another time
- and use this workaround until then.
*/
.box table { margin-top: 1em; }
/*
- Let the URL and pattern input fields scale with the browser
- width and try to prevent vertical scroll bars if the width
- is less than 80 characters.
*/
input.url, input.pattern { width: 95%; }
502
The system is Ubuntu 16.04.
I just tested it, and it works.
Could you try calling it the way I do?
@yfzmk2013
Command: curl ip.gs
Current IP / 当前 IP: 45.62.105.15
ISP / 运营商: it7.net
City / 城市: Los Angeles California
Country / 国家: United States
IP.GS is now IP.SB, please visit https://ip.sb/ for more IP information, ip.gs will only use for curl purpose. / IP.GS 已更新至 IP.SB 请访问 https://ip.sb/ 获取更多信息, ip.gs 域名仅作 curl 使用
Please join Telegram group https://t.me/sbfans if you have any issues. / 如有问题,请加入 Telegram 群 https://t.me/sbfans
However, I cannot ping www.google.com, even though I can open Google in my browser. I am not sure whether, on your side, the command line can already reach Google.
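A note on the ping test: ping uses ICMP, which a SOCKS5 proxy does not carry, so a failed ping to www.google.com does not by itself show that the proxy is broken. A more direct check is whether the local proxy port answers a SOCKS5 greeting. The sketch below is only an illustration under stated assumptions: `speaks_socks5` is a hypothetical helper name, not part of Image-Downloader, and it only tests the first step of the SOCKS5 handshake (RFC 1928), not whether Google is reachable through the proxy.

```python
import socket


def speaks_socks5(host, port, timeout=3.0):
    """Return True if host:port answers a SOCKS5 greeting (hypothetical helper).

    Sends the RFC 1928 greeting "version 5, one method, no authentication"
    and checks that the two-byte reply starts with version 5 and does not
    reject all methods (0xFF).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"\x05\x01\x00")
            reply = sock.recv(2)
    except OSError:
        # Connection refused, timed out, or reset: not a usable SOCKS5 port.
        return False
    return len(reply) == 2 and reply[0] == 5 and reply[1] != 0xFF
```

For example, `speaks_socks5("127.0.0.1", 1080)` would probe the address passed as `proxy` in the call shown in this thread. An HTTP proxy such as Privoxy listening on that port would normally not answer with a SOCKS5 reply, so the function should return False in that case.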
@yfzmk2013
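I do run the proxy on my router, but that should not matter. If the socks5 proxy given in the arguments is wrong, the program will not run correctly either.
Also, when I originally developed this program, I tested it under the same conditions as yours, so that should not be the cause.
Please tell me the exact Python version, the versions of the libraries you use, and the PhantomJS version, and I will see whether I can reproduce it.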
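The version details requested above can be gathered in one go. The following is a minimal sketch, not part of Image-Downloader: `collect_versions` is a hypothetical helper, and it assumes the PhantomJS executable is named `phantomjs` and is on `PATH`.

```python
import platform
import shutil
import subprocess


def collect_versions(phantomjs_cmd="phantomjs"):
    """Collect Python, selenium and PhantomJS versions (hypothetical helper)."""
    info = {"python": platform.python_version()}

    # selenium exposes its version as selenium.__version__.
    try:
        import selenium
        info["selenium"] = getattr(selenium, "__version__", "unknown")
    except ImportError:
        info["selenium"] = "not installed"

    # Ask the PhantomJS binary itself, if it can be found on PATH.
    path = shutil.which(phantomjs_cmd)
    if path is None:
        info["phantomjs"] = "not found on PATH"
    else:
        try:
            proc = subprocess.run([path, "--version"],
                                  stdout=subprocess.PIPE,
                                  stderr=subprocess.STDOUT,
                                  timeout=10)
            info["phantomjs"] = proc.stdout.decode(errors="replace").strip()
        except (OSError, subprocess.SubprocessError) as exc:
            info["phantomjs"] = "failed to run: {}".format(exc)
    return info


if __name__ == "__main__":
    for name, version in sorted(collect_versions().items()):
        print("{}: {}".format(name, version))
```

Pasting this output into the issue gives the maintainer the environment details needed to attempt a reproduction.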
@sczhengyabin
Hello,
this is how I call the function:
name_st = '里皮'
crawled_urls = crawler.crawl_image_urls(keywords=name_st,
engine='Google', max_number=10000,
face_only=False, safe_mode=True,
proxy_type="socks5", proxy="127.0.0.1:1080",
browser="phantomjs")
Python version: Python 3.5.2; PhantomJS version: 2.1.1
Error output:
keywords: 里皮
Number: 10000
Face Only: False
Safe Mode: True
Query URL: https://www.google.com/search?tbm=isch&hl=en&q=%E9%87%8C%E7%9A%AE&safe=on
Traceback (most recent call last):
File "image_downloader_google.py", line 100, in <module>
browser="phantomjs")
File "/home/yanhao/project/DengHong_Git/Image-Downloader/crawler.py", line 254, in crawl_image_urls
service_args=phantomjs_args, desired_capabilities=dcap)
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/phantomjs/webdriver.py", line 58, in __init__
desired_capabilities=desired_capabilities)
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/remote/webdriver.py", line 92, in __init__
self.start_session(desired_capabilities, browser_profile)
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/remote/webdriver.py", line 179, in start_session
response = self.execute(Command.NEW_SESSION, capabilities)
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/remote/webdriver.py", line 236, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python3.5/dist-packages/selenium/webdriver/remote/errorhandler.py", line 163, in check_response
raise exception_class(value)
selenium.common.exceptions.WebDriverException: Message:
(the same Privoxy CGI error page as in the first traceback, ending in 502)
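@yfzmk2013 Sorry, I created a fresh virtualenv to test, and it still works without problems. I searched for the error, and it may be a proxy issue.
I do not know which software you use for SS. I tried a proxy started with a local sslocal, and it worked; the one on my router works too, and an SS proxy running inside a Windows virtual machine also works.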
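I ran into the same problem on Windows 10: after connecting to the VPN, running the code kept raising selenium.common.exceptions.WebDriverException: Message: . I struggled with it for a long time. At first I suspected the web proxy, because I could not ping Google. Eventually I found that the proxy client has several modes; after looking up the differences, I switched from global proxy mode to PAC mode, and the program then ran normally.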
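One possible reading of the Privoxy 502 page in the traceback, offered as an assumption rather than a confirmed diagnosis: with a global proxy, the request that Selenium sends to the local PhantomJS driver may itself be routed through the proxy, which cannot serve it. If that is the cause and the proxy is configured through environment variables such as `http_proxy`, excluding localhost via `no_proxy` (which Python's urllib honours) might have an effect similar to switching to PAC mode. The sketch below uses a hypothetical helper, `bypass_proxy_for_localhost`, that is not part of Image-Downloader.

```python
import os


def bypass_proxy_for_localhost(env=None):
    """Add localhost entries to no_proxy/NO_PROXY (hypothetical helper).

    Existing entries are preserved. Returns the resulting no_proxy value.
    """
    if env is None:
        env = os.environ
    local_hosts = ["localhost", "127.0.0.1"]
    for key in ("no_proxy", "NO_PROXY"):
        # Split the current comma-separated list, ignoring empty items.
        entries = [h.strip() for h in env.get(key, "").split(",") if h.strip()]
        for host in local_hosts:
            if host not in entries:
                entries.append(host)
        env[key] = ",".join(entries)
    return env["no_proxy"]
```

It would need to be called before the webdriver is created, for example at the top of the script that calls `crawler.crawl_image_urls`. Whether it helps depends on how the proxy client applies its global mode, so treat it as something to try, not a guaranteed fix.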