If you know of other sites that offer free proxies, post them here and I will add them to the project

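@jhao104 Hello. I see that people have suggested a few proxy sites hosted outside the firewall, and they all look decent. Could you add them when you have time? (I couldn't get them working myself: some have anti-scraping measures, and some open fine in a browser but requests can't connect.)
Thanks.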
http://free-proxy.cz/zh/proxylist/country/US/https/ping/all
http://www.gatherproxy.com
http://proxydb.net/?protocol=https&anonlvl=4

The project currently has only 3 proxy sources from outside the firewall, so very few proxies get fetched.

By the way, why can a browser open these pages when requests can't connect to them?

J_hao104 · Answer 9 · Tue Nov 19 2019 17:14:39 GMT+0800 (China Standard Time)

@dota2heqiuzhi #385 (comment)

Dino Dannard · Answer 10 · Tue Nov 19 2019 17:18:41 GMT+0800 (China Standard Time)

@jhao104 Do you have time to add these sites?
If not, I'll try to figure it out myself.

J_hao104 · Answer 11 · Tue Nov 19 2019 17:22:27 GMT+0800 (China Standard Time)

For the ones outside the firewall, you can have a go at them yourself first.

Dino Dannard · Answer 12 · Thu Nov 21 2019 14:53:32 GMT+0800 (China Standard Time)

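Both of these sites have anti-scraping measures, and I can't get past them.
When you have time, could you crack one of them so I can learn from it?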
http://free-proxy.cz/zh/proxylist/country/US/https/ping/all
http://proxydb.net/?protocol=https&anonlvl=4

@jhao104

J_hao104 · Answer 13 · Mon Nov 25 2019 12:29:37 GMT+0800 (China Standard Time)

The values are generated dynamically by JS. Pull that piece of JS out of the page and run it with PyV8 or PyExecJS, and you will get them.

Dino Dannard · Answer 14 · Mon Nov 25 2019 12:33:04 GMT+0800 (China Standard Time)

The main problem is that I don't know JS, only a bit of Python. I tried PyExecJS for a while back then and couldn't get it to work 😂 I'll try again when I have time. Thanks!

1yzz · Answer 15 · Thu Dec 05 2019 11:17:58 GMT+0800 (China Standard Time)

http://proxydb.net/?protocol=https&anonlvl=4

    # Requires `import re`; WebRequest is the project's own HTTP helper.
    @staticmethod
    def proxyDBNet():
        urls = [
            'http://proxydb.net/?protocol=https&anonlvl=4&min_uptime=75&max_response_time=5&country=CN',
            'http://proxydb.net/?protocol=https&anonlvl=4&min_uptime=75&max_response_time=5&country=',
            'http://proxydb.net/?protocol=https&anonlvl=4&min_uptime=75&max_response_time=5&country=SG',
            'http://proxydb.net/?protocol=https&anonlvl=4&min_uptime=75&max_response_time=5&country=US',
            'http://proxydb.net/?protocol=https&anonlvl=4&min_uptime=75&max_response_time=5&country=CZ',
            'http://proxydb.net/?protocol=https&anonlvl=4&min_uptime=75&max_response_time=5&country=AR',
        ]
        request = WebRequest()

        for url in urls:
            r = request.get(url, timeout=20)
            # Only finds addresses that appear as plain ip:port text in the returned HTML
            proxies = re.findall(r'(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d+)', r.text)
            for proxy in proxies:
                yield proxy

Dino Dannard · Answer 16 · Thu Dec 05 2019 11:20:25 GMT+0800 (China Standard Time)

This site also has anti-scraping measures (JS). Your scraping logic probably won't capture any data, will it?

1yzz · Answer 17 · Thu Dec 05 2019 12:06:50 GMT+0800 (China Standard Time)

    <td>
        <script>
            var  q =
             '32.5.301'.split('').reverse().join('');
            var yxy = /* */ atob('\x4d\x69\x34\x78\x4e\x44\x59\x3d'.replace(/\\x([0-9A-Fa-f]{2})/g,function(){return String.fromCharCode(parseInt(arguments[1], 16))}));
            var  pp =  (8080 - ([]+[]))/**//**/ +  (+document.querySelector('[data-rnnumg]').getAttribute('data-rnnumg'))-[]+[];
            document.write('<a href="/' + q + yxy + '/' + pp + '#http">' + q + yxy + String.fromCharCode(58) + pp + '</a>');
        </script>
    </td>

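Find this element, wrap the contents of the script in a function, and run it with PyV8. The value of document.querySelector('[data-rnnumg]').getAttribute('data-rnnumg') can also be found in the DOM, so you can parse the DOM tree and substitute it in.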
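For this particular snippet the same result can probably be reached without a JS engine. Below is a minimal sketch, assuming every cell follows exactly the pattern quoted above: the first half of the IP stored reversed, the second half as hex-escaped base64 inside atob(...), and the port as a base number plus an integer taken from a data-* attribute. The function name `decode_proxydb_cell` and the attribute value `"0"` in the usage line are hypothetical; the real attribute value has to be read from the page's DOM.

```python
import base64
import re

# The three relevant statements from the <script> quoted above, as they
# appear in the raw HTML (a raw string keeps the \xNN escapes literal).
SAMPLE_SCRIPT = r"""
var  q =
 '32.5.301'.split('').reverse().join('');
var yxy = /* */ atob('\x4d\x69\x34\x78\x4e\x44\x59\x3d'.replace(/\\x([0-9A-Fa-f]{2})/g,function(){return String.fromCharCode(parseInt(arguments[1], 16))}));
var  pp =  (8080 - ([]+[]))/**//**/ +  (+document.querySelector('[data-rnnumg]').getAttribute('data-rnnumg'))-[]+[];
"""


def decode_proxydb_cell(script_text, attr_value):
    """Rebuild 'ip:port' from one obfuscated cell (hypothetical helper).

    Raises AttributeError if the script does not match the assumed pattern.
    """
    # First half of the IP is written reversed and flipped back by the JS.
    reversed_part = re.search(
        r"'([\d.]+)'\.split\(''\)\.reverse\(\)", script_text
    ).group(1)
    first_half = reversed_part[::-1]

    # Second half is base64 text spelled out as \xNN escapes inside atob('...').
    escaped = re.search(
        r"atob\('((?:\\x[0-9A-Fa-f]{2})+)'", script_text
    ).group(1)
    b64_text = re.sub(
        r"\\x([0-9A-Fa-f]{2})", lambda m: chr(int(m.group(1), 16)), escaped
    )
    second_half = base64.b64decode(b64_text).decode("ascii")

    # (N - ([]+[])) evaluates to N in JS; the data-* attribute value is added.
    base_port = int(
        re.search(r"\((\d+)\s*-\s*\(\[\]\+\[\]\)\)", script_text).group(1)
    )
    port = base_port + int(attr_value)

    return "{}{}:{}".format(first_half, second_half, port)


# "0" is a placeholder for the value of the data-* attribute in the DOM.
print(decode_proxydb_cell(SAMPLE_SCRIPT, "0"))
```

If the site changes the obfuscation, the regular expressions stop matching and the function raises, which is when running the script in a real JS engine becomes the more robust option.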

1yzz · Answer 18 · Thu Dec 05 2019 12:08:43 GMT+0800 (China Standard Time)

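https://github.com/scrapinghub/splash There is also this tool: hand the page over to it and it does the rendering for you. I'm not sure whether it would hurt crawler throughput, though.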
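As a rough illustration of the Splash route, the sketch below builds a request to Splash's `render.html` endpoint, which returns the page's HTML after JavaScript has run. It assumes a Splash instance is already running at `http://localhost:8050` (an assumption, adjust to your deployment); the helper names `splash_render_url` and `fetch_rendered_html` are hypothetical and not part of the project.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def splash_render_url(page_url, splash_base="http://localhost:8050", wait=2):
    """Build the render.html URL for a page (hypothetical helper).

    `wait` is the number of seconds Splash waits after load so that
    scripts such as document.write() calls have time to run.
    """
    query = urlencode({"url": page_url, "wait": wait})
    return "{}/render.html?{}".format(splash_base.rstrip("/"), query)


def fetch_rendered_html(page_url, **kwargs):
    """Fetch JS-rendered HTML through Splash; needs a running Splash server."""
    with urlopen(splash_render_url(page_url, **kwargs), timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


print(splash_render_url("http://proxydb.net/?protocol=https&anonlvl=4"))
```

The rendered HTML can then be fed to the same ip:port regex used in the scraper above. Each page costs a full browser render, so this is likely to be slower than decoding the script directly.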

hanjackcyw · Answer 19 · Thu Jan 16 2020 17:33:49 GMT+0800 (China Standard Time)

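I see a lot of people asking for proxies from this site. I happened to scrape it recently, so here is my code. It needs the scrapy package, mainly because I'm used to scrapy; any other package that can do XPath parsing would work too.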

    # Requires `import base64`, `import re` and `import scrapy`;
    # WebRequest is the project's own HTTP helper.
    @staticmethod
    def freeProxy21():
        url = 'http://free-proxy.cz/en/proxylist'

        request = WebRequest()
        r = request.get(url, timeout=10)

        sel = scrapy.Selector(text=r.text)

        # Highest page number shown in the paginator
        max_page = max([int(v) for v in sel.xpath('//div[@class="paginator"]/a/text()').extract() if v.isdigit()])

        for page in range(1, max_page + 1):
            r = request.get(url + '/main/{}'.format(page), timeout=10)

            sel = scrapy.Selector(text=r.text)

            # Each IP sits base64-encoded inside a <script> decode("...") call; ports are plain text
            proxies = sel.xpath('//table[@id="proxy_list"]/tbody/tr/td/script[contains(text(),"decode")]/text()').extract()
            ports = sel.xpath('//table[@id="proxy_list"]/tbody/tr/td/span/text()').extract()

            for index, value in enumerate(proxies):
                try:
                    proxy_ip = re.search(r'.*decode\("(.*)"\)', value).group(1)
                    if proxy_ip:
                        proxy = '{}:{}'.format(base64.b64decode(proxy_ip).decode('utf-8'), ports[index])
                        yield proxy
                except (AttributeError, IndexError, ValueError):
                    # Skip rows that do not match the expected pattern
                    continue

Hai Liang W. · Answer 20 · Sat Mar 07 2020 22:35:51 GMT+0800 (China Standard Time)

@hanjackcyw scrapy is a heavy dependency. If you only need Selector, you can use parsel, the library that Scrapy's selectors are built on.

https://parsel.readthedocs.io/en/latest/

jet · Answer 21 · Fri Sep 11 2020 16:57:37 GMT+0800 (China Standard Time)

This proxy source looks decent too: https://proxy.mimvp.com/freeopen

lyon · Answer 22 · Mon Dec 21 2020 12:27:07 GMT+0800 (China Standard Time)

https://www.feizhuip.com/News-getInfo-id-1307.html This one might be good too.

TophTab · Answer 23 · Wed Dec 23 2020 09:31:20 GMT+0800 (China Standard Time)

Take a look at this one, the free list from Qingting:
https://proxy.horocn.com/free-china-proxy/all.html?page=lr&max_id=2N

Also, when deploying with Docker on a cloud server, should the redis address in the command be changed to my own?

jwdeaa · Answer 24 · Wed Feb 17 2021 00:32:51 GMT+0800 (China Standard Time)

A new proxy list: http://pzzqz.com/

J_hao104 · Answer 25 · Fri Apr 02 2021 14:44:01 GMT+0800 (China Standard Time)

Added.

gavin · Answer 26 · Tue Aug 10 2021 22:00:51 GMT+0800 (China Standard Time)

https://zhimahttp.com/?utm-source=bdtg&utm-keyword=?400359
Zhima free proxies

J_hao104 · Answer 27 · Mon Dec 27 2021 09:53:41 GMT+0800 (China Standard Time)

Their free proxies are pretty poor, and the update timestamps are all very old.

julianghttp · Answer 28 · Mon May 23 2022 09:19:25 GMT+0800 (China Standard Time)

http://www.juliangip.com/api?ref=proxy_pool
Juliang IP free proxies

xswwxx · Answer 29 · Sat Dec 31 2022 16:25:51 GMT+0800 (China Standard Time)

https://openproxylist.xyz/http.txt How do I add a source like this one?

Miku · Answer 30 · Tue May 30 2023 11:07:14 GMT+0800 (China Standard Time)

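    # WebRequest is the project's own HTTP helper.
    @staticmethod
    def freeProxy17():
        # Each URL returns a plain-text list with one ip:port per line
        urls = [
            'https://openproxylist.xyz/http.txt',
            'http://pubproxy.com/api/proxy?limit=3&format=txt&http=true&type=https',
            'https://www.proxy-list.download/api/v1/get?type=https',
            'https://raw.githubusercontent.com/shiftytr/proxy-list/master/proxy.txt'
        ]
        request = WebRequest()
        for url in urls:
            r = request.get(url, timeout=20)
            # splitlines() also handles \r\n endings; strip() drops stray whitespace
            for proxy in r.text.splitlines():
                proxy = proxy.strip()
                if proxy:
                    yield proxy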
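
Plain-text sources sometimes contain headers, comments, or error pages instead of addresses, so it may be worth validating each line before yielding it. The following is a minimal sketch of such a filter; `parse_proxy_lines` and `PROXY_LINE` are hypothetical names and not part of the project, and the check covers IPv4 `ip:port` lines only.

```python
import re

# One IPv4 address, a colon, and a port of up to five digits
PROXY_LINE = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})$")


def parse_proxy_lines(text):
    """Yield 'ip:port' strings from a plain-text list, skipping bad lines."""
    for line in text.splitlines():
        match = PROXY_LINE.match(line.strip())
        if not match:
            continue  # blank line, comment, HTML error page, etc.
        ip, port = match.group(1), int(match.group(2))
        octets_ok = all(int(octet) <= 255 for octet in ip.split("."))
        if octets_ok and 1 <= port <= 65535:
            yield "{}:{}".format(ip, port)


sample = "1.2.3.4:8080\r\n\nnot a proxy\n999.1.1.1:80\n5.6.7.8:70000\n 10.0.0.1:3128 \n"
print(list(parse_proxy_lines(sample)))
```

Inside a fetcher like the one above, the loop body would become `for proxy in parse_proxy_lines(r.text): yield proxy`.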

djme0 · Answer 31 · Thu Jun 22 2023 19:33:15 GMT+0800 (China Standard Time)

https://uu-proxy.com/
https://www.proxyscan.io/
http://www.kxdaili.com/dailiip.html
https://www.xsdaili.cn/

xiumao-cat · Answer 32 · Mon Dec 04 2023 12:11:42 GMT+0800 (China Standard Time)

https://ip.uqidata.com/free/index.html
https://www.69ip.cn/?page=3
https://proxy.ip3366.net/free/
https://www.binglx.cn