QianyanTech / Image-Downloader

Download images from Google, Bing, Baidu.

error

daixiangzi opened this issue · comments

```
raise TooManyRedirects('Exceeded %s redirects.' % self.max_redirects, response=resp)
```
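
For context, requests raises TooManyRedirects when a URL keeps redirecting more than max_redirects times (30 by default, which matches the "Exceeded 30 redirects." output below). A quick way to see what the server is doing is to turn off automatic redirect following and inspect the first hop. This diagnostic snippet is not part of crawler.py, and the URL is only a stand-in for the failing init_url:

```python
import requests

url = "https://image.baidu.com/search/acjson?..."  # stand-in for the failing init_url

try:
    requests.get(url, timeout=10)
except requests.TooManyRedirects:
    # Re-issue the request without following redirects to see where the first hop points.
    probe = requests.get(url, allow_redirects=False, timeout=10)
    print(probe.status_code, probe.headers.get("Location"))
```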

Baidu changed things recently, so the repo author needs to update. In crawler.py, in baidu_get_image_url_using_api, add a header to the call `res = requests.get(init_url, proxies=proxies)`:

```python
headers = {
    'Accept-Encoding': 'gzip, deflate, sdch',
    'Accept-Language': 'en-US,en;q=0.8',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
}
init_url = "https://image.baidu.com/search/acjson?tn=resultjson_com&ipn=rj&ct=201326592&lm=7&fp=result&ie=utf-8&oe=utf-8&st=-1&word=%25E7%258E%25A9%25E6%2589%258B%25E6%259C%25BA&queryWord=%25E7%258E%25A9%25E6%2589%258B%25E6%259C%25BA&face=0&pn=0&rn=30"
```

```diff
-195 res = requests.get(init_url, proxies=proxies)
+195 res = requests.get(init_url, proxies=proxies, headers=headers)
```
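
For anyone who wants to try the header fix outside the project first, here is a minimal standalone sketch of the patched request. The function name fetch_baidu_json and the timeout value are illustrative only; in the repo the change lives inside baidu_get_image_url_using_api in crawler.py.

```python
import requests

# Browser-like headers copied from the comment above; without them Baidu
# bounces the request into a redirect loop and requests raises TooManyRedirects.
HEADERS = {
    'Accept-Encoding': 'gzip, deflate, sdch',
    'Accept-Language': 'en-US,en;q=0.8',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': ('Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
                   '(KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36'),
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive',
}


def fetch_baidu_json(init_url, proxies=None):
    """Fetch the raw acjson response text, sending the browser-like headers."""
    res = requests.get(init_url, proxies=proxies, headers=HEADERS, timeout=10)
    res.raise_for_status()  # surface HTTP errors instead of continuing silently
    return res.text
```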

Unfortunately it does not work for me...

```
Exceeded 30 redirects.
Exceeded 30 redirects.
Exceeded 30 redirects.
Exceeded 30 redirects.
Exceeded 30 redirects.
Exceeded 30 redirects.
Exceeded 30 redirects.

== 0 out of 0 crawled images urls will be used.
```

OK, there's another line at 215 that needs to be changed, so overall this will work:

```diff
-195 res = requests.get(init_url, proxies=proxies)
+195 res = requests.get(init_url, proxies=proxies, headers=headers)

-215 response = requests.get(url, proxies=proxies)
+215 response = requests.get(url, proxies=proxies, headers=headers)
```
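
An alternative to patching each call site is to set the headers once on a requests.Session, so both the API request (around line 195) and the page request (around line 215) send them automatically. This is only a sketch of that idea, not how crawler.py is actually structured, and the trimmed-down header set is an assumption.

```python
import requests

# Assumed minimal headers; at the very least a browser-like User-Agent appears to matter.
HEADERS = {
    'User-Agent': ('Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
                   '(KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36'),
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.8',
}


def make_session(proxies=None):
    """Build a Session whose default headers (and optional proxies) apply to every request."""
    session = requests.Session()
    session.headers.update(HEADERS)
    if proxies:
        session.proxies.update(proxies)
    return session

# Usage: replace both plain requests.get(...) calls with session.get(...):
#   session = make_session(proxies)
#   res = session.get(init_url)   # was line 195
#   response = session.get(url)   # was line 215
```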

Sorry to bother you, but where exactly does this change go?

Referring to my earlier addition, line 215 also needs to change:

```diff
-215 response = requests.get(url, proxies=proxies)
+215 response = requests.get(url, proxies=proxies, headers=headers)
```

Fixed in 7013bfd
@ald2004 @mapattacker Thanks for the fix.