go-rod / rod

A Chrome DevTools Protocol driver for web automation and scraping.

Home Page: https://go-rod.github.io

Failed to deserialize params.body - BINDINGS: binary value expected at position 61

hktalent opened this issue

Rod Version:v0.114.8

hj.ContinueRequest(&proto.FetchContinueRequest{})

other:

HijackRequests error {-32000 Fetch domain is not enabled }

Please add a valid Rod Version: v0.0.0 to your issue. Current version is v0.116.0

Please fix the format of your markdown:

2 MD031/blanks-around-fences Fenced code blocks should be surrounded by blank lines [Context: "```"]
2 MD040/fenced-code-language Fenced code blocks should have a language specified [Context: "```"]
6:1 MD033/no-inline-html Inline HTML [Element: img]
10 MD040/fenced-code-language Fenced code blocks should have a language specified [Context: "```"]

generated by check-issue

Whenever I try to skip loading these resources, nothing after that point can continue, and I don't know why:

ext := filepath.Ext(ctx.Request.URL().Path)
szT := util.Any2Str(ctx.Request.Req().Header["Content-Type"]) // stylesheet, font, image
// Block the request
if strings.Contains(szT, "stylesheet") || strings.Contains(szT, "font") || strings.Contains(szT, "image") || ext == ".png" || ext == ".jpg" || ext == ".jpeg" || ext == ".gif" || ext == ".css" || ext == ".woff" || ext == ".woff2" || ext == ".ttf" || ext == ".eot" {
	//ctx.Skip = true // ctx.Request.Req().Header.Set("My-Header", "test")
	//ctx.Response.Fail(proto.NetworkErrorReasonAborted)
	return false
}
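
One thing that may be worth checking in the snippet above: the `Content-Type` header of a request describes the request body, not the resource being fetched, so for an ordinary GET of an image or stylesheet it is usually empty. A minimal sketch of the blocking decision based on the resource type instead is shown below. It uses only the standard library; `shouldBlock`, `blockedTypes` and `blockedExts` are hypothetical names, and the type strings are assumed to match the DevTools `Network.ResourceType` values (I believe rod exposes this as `ctx.Request.Type()`, but please verify against your version).

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Resource types to block. Assumption: these strings match the values of
// the DevTools Network.ResourceType enum.
var blockedTypes = map[string]bool{
	"Image":      true,
	"Stylesheet": true,
	"Font":       true,
}

// File extensions to block as a fallback when the type is not conclusive.
var blockedExts = map[string]bool{
	".png": true, ".jpg": true, ".jpeg": true, ".gif": true,
	".css": true, ".woff": true, ".woff2": true, ".ttf": true, ".eot": true,
}

// shouldBlock (hypothetical helper) reports whether a request should be
// aborted, given its resource type and the path part of its URL.
func shouldBlock(resourceType, urlPath string) bool {
	if blockedTypes[resourceType] {
		return true
	}
	return blockedExts[strings.ToLower(path.Ext(urlPath))]
}

func main() {
	fmt.Println(shouldBlock("Image", "/logo"))                       // true
	fmt.Println(shouldBlock("XHR", "/adrive/v2/file/list_by_share")) // false
	fmt.Println(shouldBlock("Other", "/fonts/a.WOFF2"))              // true
}
```

Inside a hijack handler, a `true` result would presumably lead to the `ctx.Response.Fail(...)` call that is commented out above, and a `false` result to `ctx.ContinueRequest(&proto.FetchContinueRequest{})`, so that every intercepted request gets exactly one answer.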

Sorry, I don't quite understand your description.

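1. I added hijack handlers with MustAdd. Because I am writing a crawler, I want to block images, fonts, and CSS files from loading, but I have never managed to make it work.
2. I would like to receive an event once the page has finished loading and all the registered hijack handlers have run, so that I can close the page promptly. It seems that if the crawler closes the page immediately after page.MustNavigate(szUrl.Url).MustWaitLoad(), many of the handlers never get a chance to run. At the moment I rely on a 70-second delay, which is not very friendly.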

You can try crawling this page: https://www.alipan.com/s/4iQjiU9KUmG
1. Capture the data returned by "/adrive/v2/file/list_by_share"
3. Block images, fonts, and CSS files from loading

I'm not sure about this:

wait := page.MustWaitNavigation()
page.MustNavigate(szUrl.Url).MustWaitLoad()
wait()

I'm not sure whether it is appropriate to close the page at this point.