Enable crawl via robots but prevent site indexing via noindex
ralph-burstsms opened this issue · comments
Google's documentation says that for `noindex` to work properly, the page must not be blocked via robots.txt; otherwise the crawler never fetches the page and never sees the tag. It looks like the robots module isn't working this way.
See important note here: https://developers.google.com/search/docs/crawling-indexing/block-indexing
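The gist of Google's note is that the two signals conflict: a `Disallow: /` rule stops the crawler before it can ever read the `noindex` directive, so already-known URLs can stay in the index. A minimal sketch of the combination Google recommends, as a plain helper function (the function name and return shape are my own illustration, not part of any module's API):

```typescript
// Sketch: robots signals for "crawlable but not indexable".
// Assumption (not from nuxt-simple-robots): robots.txt stays permissive
// even when indexing is off, and the noindex signal is sent per-response
// (meta tag or X-Robots-Tag header) so crawlers can actually see it.
function robotsSignals(indexable: boolean): { robotsTxt: string; xRobotsTag?: string } {
  // Crawling stays open in BOTH cases: an empty Disallow allows everything.
  const robotsTxt = "User-agent: *\nDisallow:";
  if (indexable) {
    return { robotsTxt };
  }
  // Indexing disabled: keep crawling open, signal noindex on each response.
  return { robotsTxt, xRobotsTag: "noindex, nofollow" };
}
```

The key point is that the robots.txt body is identical in both branches; only the per-response `noindex` signal changes.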
Currently, I'm only using:
site: { indexable: process.env.NUXT_SITE_ENV === "production" }, // false
But it renders both a blocking robots.txt:
# START nuxt-simple-robots (indexing disabled)
User-agent: *
Disallow: /
# END nuxt-simple-robots
and the noindex meta tag:
<meta name="robots" content="noindex, nofollow">
Am I doing this correctly, or did I miss a configuration?
Hmm, it sounds like you're trying to get a page that has already been indexed removed from the index? For that I'd suggest using Google's removal tools.
I think changing the generated robots.txt is a bit risky for sites that don't have this issue. I could add an escape-hatch config if it helps; otherwise, please share some ideas.
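An escape hatch could look something like the sketch below. Note this is purely hypothetical: the `disallowOnNonIndexable` option is an invented name to illustrate the idea, not an existing nuxt-simple-robots option.

```typescript
// Hypothetical nuxt.config sketch. `disallowOnNonIndexable` is an
// invented option name, NOT part of nuxt-simple-robots today.
export default defineNuxtConfig({
  modules: ["nuxt-simple-robots"],
  // Indexing off outside production (this much matches the issue).
  site: { indexable: process.env.NUXT_SITE_ENV === "production" },
  robots: {
    // Escape hatch: keep robots.txt permissive even when the site is
    // not indexable, so crawlers can still fetch pages and see the
    // noindex meta tag / X-Robots-Tag header.
    disallowOnNonIndexable: false,
  },
});
```

With something like this, a non-indexable site would emit only the `noindex` meta tag while leaving `Disallow` empty, matching the behaviour Google's note asks for.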
By the way, you shouldn't need the nuxt.config `site` config you shared; this is the default behaviour.