nodejs / build

Better build and test infra for Node.

Don't build/serve website from NGINX

MattIPv4 opened this issue

The website is now served from Vercel, so serving it from NGINX as well is redundant. The NGINX copy available at origin.nodejs.org has also been observed in Google's index, so removing it should resolve that.

Note that dist, download, docs, api, etc. should still be served from NGINX.

The site build script can be removed, as well as various parts of the NGINX config relating to the serving of Next.js:

# Sets a custom 404 error page for the whole Server region
# we use the Next.js generated 404 page for all the 404's of our Website
# including for binaries, assets and et cetera.
error_page 404 @localized_404;

# This Location directive is the primary Location directive for any request not handled by the
# mutual exclusivity Location directives (the ones started by ^~) and pretty much handles the requests for the Website pages
# as in general all other requests should not fall here.
location / {
    # We rewrite all Website pages ending with a trailing slash, removing the trailing slash
    # This is done because our Next.js deployment doesn't use trailingSlash, in other words
    # /en/blog actually translates into /en/blog.html within Next.js built (exported) files
    # Removing the trailing slash from the request allows us to do an external permanent redirect
    # that will then fall back to this same Location block.
    # For all Website pages we won't have any single **/index.html, meaning that we don't need to
    # test for $uri/
    rewrite ^/(.*)/$ /$1 permanent;
    # Tries the $uri first and if there's no $uri for that e.g. /en/blog
    # it attempts with /en/blog.html, which for the Website it will exist.
    # This is basically a rewrite to remove the ".html" extension from our Website pages
    # NOTE: By disabling trailingSlash config option on Next.js, less folders need to be created.
    # If a file doesn't exist, it attempts to invoke the @english_fallback, as in most cases
    # for the Website, it means that, for example, /es/blog will not exist, but /en/blog exists
    # so it attempts to open that page on its English version. Note that @english_fallback
    # will only redirect two-letter-code pages to english ones, everything else goes right to 404.
    try_files $uri $uri.html @english_fallback;
    location ~ \.json$ {
        add_header access-control-allow-origin *;
    }
}
# This Location is used for handling static Next.js files. As we don't want to log access
# to static directories and also we don't want to log not found requests here
# As this is a static directory that in theory should not change over time, we disable access
# logs and also cache 404's errors as this folder contents change completely on every build
# We don't use ^~ as there are other Rewrite directives below that should also be taken into consideration
# before failing the request with a 404 if it doesn't exist
location /static {
    access_log off;
    log_not_found off;
    open_file_cache_errors on;
}
# This Location directive is used to handle Next.js internal _next directory
# As this is an internal directory requested by Next.js itself, we disable access
# logs and also cache 404's errors as this folder contents change completely on every build
# We use ^~ to tell NGINX to not process any other Location directive or Rewrite after this match
location ^~ /_next {
    access_log off;
    log_not_found off;
    open_file_cache_errors on;
}

# When a website 404 occurs, attempt to load the English version of the page
# if the request was for a localised page.
# Also, store the original language of the request if it was localised
# We'll use this language for the 404 in the try_files in @localized_404
location @english_fallback {
    # @TODO: Handle Localization Fallback through Next.js SSR as this is a hacky approach and requires
    # continuous maintenance of the supported languages
    if ($uri ~* ^/(ar|be|ca|de|es|fa|fr|gl|id|it|ja|ka|ko|nl|pt-br|ro|ru|tr|uk|zh-cn|zh-tw)/) {
        set $lang $1;
    }
    rewrite ^/(ar|be|ca|de|es|fa|fr|gl|id|it|ja|ka|ko|nl|pt-br|ro|ru|tr|uk|zh-cn|zh-tw)/(.*)$ /en/$2;
}
# This location directive handles all 404 responses for the server
# If the request was a localised website page, use the requested language
# as set by the @english_fallback location block
# Otherwise, this will fallback to $lang being "en" as defined numerous lines above
location @localized_404 {
    # We disable caching of 404 pages as we always want Cloudflare to check if the file now exists
    # Some 404s may be caused by the server reaching maximum concurrent file system open() requests
    # Disabling cache allows Cloudflare to re-evaluate the same $uri once our server recovers and then properly cache it
    add_header Cache-Control "private, no-store, max-age=0" always;
    # If this was a rewritten i18n request from @english_fallback, use the localized 404
    # If there is no 404 page for that locale, fallback to the English 404
    # As a last resort, fallback to NGINX's default 404. This should never happen, and will emit a [crit]
    try_files /$lang/404.html /en/404.html =404;
}
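
For anyone reviewing what the @english_fallback rewrite above actually does before it is removed, here is a rough sketch of the same substitution outside NGINX. The function name english_fallback and the LANGS variable are hypothetical and not part of the repository. It mirrors only the case-sensitive rewrite line, not the $lang capture or the 404 handling.

```shell
#!/bin/sh
# Sketch only: english_fallback and LANGS are illustrative names, not repo code.
# The language list is copied from the rewrite in @english_fallback.
LANGS='ar|be|ca|de|es|fa|fr|gl|id|it|ja|ka|ko|nl|pt-br|ro|ru|tr|uk|zh-cn|zh-tw'

english_fallback() {
  # Print the /en/ equivalent of a localized path.
  # Paths that do not start with a listed language prefix are printed unchanged.
  printf '%s\n' "$1" | sed -E "s#^/(${LANGS})/(.*)\$#/en/\\2#"
}
```

With this, `english_fallback /es/blog` prints `/en/blog`, while a path such as `/dist/latest/` passes through untouched, which matches the comment that everything else goes right to 404.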

- "/home/nodejs/build-site.sh nodejs"

- "*/5 * * * * nodejs /home/nodejs/check-build-site.sh nodejs"

if [ "X$site" != "Xiojs" ] && [ "X$site" != "Xnodejs" ]; then
echo "Usage: check-build-site.sh < iojs | nodejs >"

if [ "X$site" != "Xiojs" ] && [ "X$site" != "Xnodejs" ]; then
echo "Usage: build-site.sh < iojs | nodejs >"

if [ "$site" == "nodejs" ]; then
build_cmd="npm run deploy"
rsync_from="build/"
else

I'm probably missing some other bits.

The first thing I can do is delete the website contents from the server. It's outdated now anyway. @nodejs/build-infra wdyt?

👍 Not build infra, but removing the site content seems like a good first step and would solve for the Google indexing issue (and act as a confirmation we don't need it before removing all the code/config for building/serving it).

I moved all website files to a folder at /home/www/nodejs_old in case we need to recover something, and updated the robots.txt to disallow everything.

Remains:

$ ls nodejs
robots.txt  traffic-manager
$ cat nodejs/robots.txt
User-Agent: *
Disallow: /
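
For reference, the cleanup described above could be scripted roughly as follows. This is a sketch under assumptions, not the commands that were actually run: archive_site is a hypothetical helper, and the keep-list (robots.txt, traffic-manager) is taken from the listing above.

```shell
#!/bin/sh
# Sketch only: archive_site is a hypothetical helper, not a script from the repo.
archive_site() {
  src=$1    # directory currently holding the website files
  dest=$2   # backup directory, created if missing
  mkdir -p "$dest"
  for entry in "$src"/*; do      # note: the glob does not match dotfiles
    [ -e "$entry" ] || continue  # skip the unexpanded glob when src is empty
    case "$(basename "$entry")" in
      robots.txt|traffic-manager) ;;   # assumed keep-list, stays in place
      *) mv "$entry" "$dest"/ ;;
    esac
  done
  # Overwrite robots.txt with a blanket disallow for all crawlers
  printf 'User-Agent: *\nDisallow: /\n' > "$src/robots.txt"
}
```

It would be called with the site directory and the backup directory as arguments, for example `archive_site nodejs /home/www/nodejs_old` from the parent of the nodejs directory (the relative source path is an assumption).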

Opened #3641 to remove the build scripts and webhook.