jaeles-project / gospider

Gospider - Fast web spider written in Go

Avoid including wrong results caused by html-redirect

DrRek opened this issue · comments

Try running gospider against a site like https://vivy.com. Because of how the application implements its "redirect", every URL is added to the output file simply because the server returns a 200 status code.

Issue:
gospider reports false-positive URLs for applications that implement redirects without using HTTP status codes.

To replicate:
Go to https://vivy.com/made_up_url and notice that the application returns a 200 status code even though the content is an HTML page that redirects to the home page:

<html>
<head>
	<meta http-equiv="refresh" content="0; url=https://www.vivy.com/" />
</head>
<body></body>
</html>

Possible way to solve this:
Add an option to specify a regex that, if matched in the response, automatically discards the URL or marks it in the output file.
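
To make the idea concrete, here is a rough sketch in Go of what such a check could look like. It is not gospider code: the names `metaRefreshRe` and `isSoftRedirect` are made up for illustration, and the pattern is just one possible regex a user might pass in to catch the meta-refresh page shown above.

```go
package main

import (
	"fmt"
	"regexp"
)

// metaRefreshRe is a hypothetical example of a user-supplied discard pattern.
// It matches a <meta> tag whose http-equiv attribute is "refresh"
// (case-insensitive, quotes optional).
var metaRefreshRe = regexp.MustCompile(`(?i)<meta[^>]+http-equiv\s*=\s*["']?refresh["']?[^>]*>`)

// isSoftRedirect (hypothetical helper) reports whether the response body
// matches the discard pattern, meaning the URL should be dropped or flagged
// even though the server answered with a 200 status code.
func isSoftRedirect(body string, discard *regexp.Regexp) bool {
	return discard.MatchString(body)
}

func main() {
	// Body returned for https://vivy.com/made_up_url, as quoted in this issue.
	redirectPage := `<html>
<head>
	<meta http-equiv="refresh" content="0; url=https://www.vivy.com/" />
</head>
<body></body>
</html>`

	// An ordinary page without a meta refresh, for comparison.
	normalPage := `<html><body>real content</body></html>`

	fmt.Println(isSoftRedirect(redirectPage, metaRefreshRe)) // true
	fmt.Println(isSoftRedirect(normalPage, metaRefreshRe))   // false
}
```

Keeping the pattern user-configurable, rather than hard-coding the meta-refresh case, would also cover sites that redirect through JavaScript or a custom "not found" page.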