Index page always gets scanned, even if it's not in the provided URLs
BennyAlex opened this issue · comments
Describe the bug
When given a list of URLs, e.g.:
```js
[
  'https://www.guetersloh.de/de/index.php',
  'https://www.guetersloh.de/de/datenschutz.php',
  'https://www.guetersloh.de/de/index.php#anchorContent',
  'https://www.guetersloh.de/de/leben-in-guetersloh.php',
  'https://www.guetersloh.de/de/leben-in-guetersloh/ehrenamt.php',
]
```
it still scans the "/" page and I get an additional sixth report:
"requestedUrl": "https://www.guetersloh.de/",
"finalUrl": "https://www.guetersloh.de/",
```
D Creating Unlighthouse                                                          Unlighthouse 19:27:03
D Setting Unlighthouse Site URL [Site: https://www.guetersloh.de]                Unlighthouse 19:27:04
starting unlighthouse
D Starting Unlighthouse [Server: undefined Site: https://www.guetersloh.de Debug: true] Unlighthouse 19:27:04
i The url config has been provided with 5 paths for scanning. Disabling sitemap, sampling and crawler. Unlighthouse 19:27:04
D Route has been queued. Path: / Name: _index.                                   Unlighthouse 19:27:04
D Route has been queued. Path: /de/datenschutz.php Name: de-slug.                Unlighthouse 19:27:04
D Route has been queued. Path: /de/index.php Name: de-slug.                      Unlighthouse 19:27:04
D Route has been queued. Path: /de/leben-in-guetersloh.php Name: de-slug.        Unlighthouse 19:27:04
D Route has been queued. Path: /de/leben-in-guetersloh/ehrenamt.php Name: de-leben-in-guetersloh-slug.
```
Reproduction
No response
System / Nuxt Info
No response
Try using:

```js
scanner: {
  // exclude specific routes
  exclude: [
    '^https:\/\/www\.guetersloh\.de\/$'
  ]
}
```