nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨

Home Page: https://nodejs.org


Invalidate cache when using import

jonathantneal opened this issue · comments

How do I invalidate the cache of import?

I have a function that installs missing modules when an import fails, but the import statement seems to preserve the failure while the script is still running.

import('some-module').catch(
  // this catch will only be reached the first time the script is run because resolveMissingModule will successfully install the module
  () => resolveMissingModule('some-module').then(
    // again, this will only be reached once, but it will fail, because the import seems to have cached the previous failure
    () => import('some-module')
  )
)

The only information I found regarding import caching was this documentation, which does not tell me where the “separate cache” used by import can be found.

No require.cache

require.cache is not used by import. It has a separate cache.
https://nodejs.org/api/esm.html#esm_no_require_cache
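
For context, a minimal sketch of that difference, assuming hypothetical local files ./config.cjs and ./config.mjs and a Node version with top-level await (run as an .mjs file):

import { createRequire } from 'module';
const require = createRequire(import.meta.url);

// CommonJS: the cache is exposed and mutable.
const first = require('./config.cjs');
delete require.cache[require.resolve('./config.cjs')];
const second = require('./config.cjs'); // re-evaluated from disk

// ESM: repeated dynamic imports of the same specifier resolve to the same
// cached module namespace; there is no exposed cache to delete.
const a = await import('./config.mjs');
const b = await import('./config.mjs');
console.log(a === b); // true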

commented

the import cache is purposely unexposed. adding a query has been the generally accepted ecosystem practice to re-import something.

however, a failure to import something will not fill the cache.

this trivial program works fine for me (assuming nope.mjs does not exist):

import fs from 'fs';

import('./nope.mjs')
  .catch(() => fs.writeFileSync('./nope.mjs', '')) // create the missing file as an empty module
  .then(() => import('./nope.mjs'))
  .then(console.log);

@devsnek, hmm, might this be limited to imports that use node_modules? This similarly trivial program fails for me the first time, but not the second.

import child_process from 'child_process';

import('color-names')
  .catch(() => child_process.execSync('npm install --no-save color-names'))
  .then(() => import('color-names'))
  .then(console.log);
commented

if it's just happening with node_modules it could be #26926

can this be closed?

I think a use case like this would hopefully be implemented as a loader. Do we already track this as a use case in that context?

@jkrems we have old documents with that as a feature, but no success criteria examples.

FYI, I'm implementing ESM support in Mocha (mochajs/mocha#4038), and cannot currently implement "watch mode", whereby Mocha watches the test files, and reruns them when they change. So "watch mode" in Mocha, in the first iteration, will probably not support ESM, which is a bummer.

While we could use cache busting query parameters, that would mean that we are always increasing memory usage, and old and never-to-be-used versions of the file will continue staying in memory due to the cache holding on to them.

And I'm not sure a loader would help here, as the loader also has no access to the cache.

commented

I'm really not a fan of the idea of our module cache being anything except insert-only. CJS cache modification is bad already, and CJS modules don't even form graphs.

Additionally, other runtimes (like browsers) will never expose this functionality, so some alternative system will have to be used for them regardless of what node does, in which case it seems like that system could just be used for node.

@giltayar have you looked into using Workers or other solutions to have a module cache that you can destroy (such as by killing the Worker)?

@bmeck - interesting. That would mean that the tests themselves run in Workers. While I am theoretically familiar with workers, I haven't yet had any experience with them: is code that runs in the main process also compatible inside a worker? In other words, compatibility-wise, would all test code that works today in the "main process" work inside workers?

I wouldn't want Mocha to have a version (even a semver-major breaking one) where developers will need to tweak their code because now it's running inside a worker. I'm guessing that there's a vast amount of that code running inside Mocha, and any incompatibility would be a deal breaker.

commented

there are differences between workers and the main thread, mostly surrounding the functions on process, like process.exit() in a worker doesn't end the process, just the thread. There's a good list here: https://nodejs.org/api/worker_threads.html#worker_threads_class_worker

Looking at the list, I can see process.chdir() is not available, which is probably a deal breaker in many tests (unit tests probably don't use process.chdir(), but Mocha is used for all sorts of tests), as is the fact that some native add-ons may break (although I'm not sure how big of a problem this is in the real world).

I would hesitate to say this, as my only contribution to Mocha currently is this pull request, but I would guess that the owners would veto this. Or maybe allow this only if we add a --run-in-workers option. In any case, without looking too much at the code, this is probably a significant investment to implement for supporting ES Modules, as this is not a simple refactor, but rather an architectural change in how Mocha works.

If it wasn't apparent from the above, I believe I would still prefer a "module unloading" API, unless the working group is adamant and official about not having one, of course. Which would probably mean going the "subprocess"/"worker" route.

commented

i admittedly don't know much about mocha... is using a separate process not doable either?

I'll go back to the Mocha contributors team with this.

Hi, I work on Mocha!! I am trying to see how we can move @giltayar's PR forward.

There are actually two situations in which "module unloading" is needed in Mocha:

  1. In "watch" mode with CJS scripts, when Mocha detects a file must be reloaded, it is deleted from require.cache and re-required, then tests are re-run. Mocha is not the only tool that does this sort of cache-busting.
  2. When developers are writing tests with Mocha (and many other test frameworks), they may want to use module-level mocking--they essentially replace one module with another phony one (I'm going to tag @theKashey here because he knows more about this ). Or even pretend like a module does not exist at all. It is then very important to Mocha that users can consume these sort of mocking frameworks to write their test code.
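
A minimal sketch of the watch-mode flow from item 1, written against a hypothetical CJS test file ./some.spec.cjs (not Mocha's actual implementation):

const fs = require('fs');
const path = require('path');

const specPath = path.resolve(__dirname, 'some.spec.cjs');

function runTests() {
  delete require.cache[specPath]; // evict the stale copy...
  require(specPath);              // ...and re-evaluate it from disk
}

runTests();
fs.watch(specPath, () => runTests()); // re-run whenever the file changes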

In the first case, it's possible, though probably at a performance cost, that Mocha could leverage workers to handle ESM. I don't know enough about workers to say whether this will provide a sufficient environment for the test cases, but it feels like a misuse of the workers feature. At minimum it seems like a lot of added complexity.

In the second case, I can't see how using workers would be feasible. Test authors need to be able to mock modules on-the-fly and reference them directly from test cases, using mocking frameworks.


I don't know why this sort of behavior was omitted from the official specification. If the reasons involve "browser security", well, it further reinforces that browsers are a hostile environment for testing. I do know that this behavior is a very real need for many, from library and tooling authors down to developers working on production code.

We do need an "unload module" API; until such a thing lands, tools will be limited, implementations will be difficult (if possible), and end users will be frustrated when their tests written in ESM don't work. I will also be frustrated, because those frustrated users will complain in Mocha's issue tracker!

I'm happy to talk in further detail about use-cases, but I'm eager to put an eventual API description in the more-capable hands of people like @guybedford.

@devsnek Given that enabling it also enables tooling, I'm curious why you feel locking this sort of thing down is a better direction?

cc @nodejs/tooling

P.S. I will be at the collab summit, and the tooling group will be hosting a collaboration session; maybe this can be a topic of discussion, or vice-versa if there's a modules group meeting...?

Do you need an unloading API for watch mode? Yes, you need it to update the changed module code.

However, is it enough to handle watch mode? No, as long as the idea is to use the changed module, because you have to find the parents between you (a test) and the changed module, and wipe them to perform a proper reinitialization.

So - the ability to invalidate a cache entry is not enough; for the mocking task we also have to know the cache graph, so we can traverse it and understand what work needs to be done.

An API for unloading modules certainly makes sense.

In my opinion - this feature is something missing for proper code splitting. There are already 100MB bundles, separated into hundreds of pieces you will never load simultaneously. But if you do - there is no way to unload them. Eventually, the page or the application would just crash.

@boneskull - the second case you mentioned, I believe can and should be handled by module "loaders", which are a formal way to do "require hooks" for ESM. These will enable testing frameworks (like sinon and others) to manipulate how ES modules are loaded, and, for example, exchange other modules for theirs.

The spec and implementation for that are actively being discussed and worked on by the modules working group (see nodejs/modules#351).

I also need this. I'm making a template rendering engine. When generating the compiled template, I read from a custom format and output to a .js file (a standard ES Module). In order to use the file, I just import it. Upon file changes, I would like to re-write the file, clear the import cache and then re-import it.

commented

These all sound like use cases for V8's LiveEdit debug api (https://chromedevtools.github.io/devtools-protocol/v8/Debugger#method-setScriptSource). You can call into it using https://nodejs.org/api/inspector.html. cc @giltayar @boneskull

+1 for unloading ES Modules.
It's hard to make Hot Module Reload otherwise. Not for production but for development tools.
And using a ?query=x doesn't seem to work on file imports in node 13.11.0, at least.
Thanks

@devsnek Can you provide a little example or pseudo-code on the usage of setScriptSource? I have been researching for an hour without progress. Thanks

@devsnek ok I progressed, I will post my findings back

commented

@georges-gomes you can subscribe to the Debugger.scriptParsed event to track the script id, and then when you need to modify the script you can call Debugger.setScriptSource.
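
A rough sketch of that flow using the in-process inspector API; ./target.js, the replacement source, and the URL keying are assumptions, and later comments in this thread report that this path is fragile for ES modules:

import inspector from 'inspector';
import { pathToFileURL } from 'url';

const session = new inspector.Session();
session.connect();

const scriptIds = new Map();

// V8 reports every parsed script once the debugger is enabled;
// remember their ids keyed by whatever URL V8 reports.
session.on('Debugger.scriptParsed', ({ params }) => {
  if (params.url) scriptIds.set(params.url, params.scriptId);
});

session.post('Debugger.enable');

// Later (e.g. from a file watcher), try to swap the script's source in place.
function patchScript(filePath, newSource) {
  const scriptId = scriptIds.get(pathToFileURL(filePath).href);
  if (!scriptId) throw new Error(`script not seen by the debugger: ${filePath}`);
  session.post(
    'Debugger.setScriptSource',
    { scriptId, scriptSource: newSource },
    (err, result) => {
      if (err) console.error('LiveEdit failed', err);
      else console.log('LiveEdit applied', result);
    }
  );
}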

@georges-gomes If you are successful, I would be very grateful if you could post a short description on how you could use setScriptSource to solve this problem. On a blog post or something.

@lulzmachine here is a working prototype https://gist.github.com/georges-gomes/6dc743addb90d2e7c5739bba00cf95ea

Unfinished but working. I have seen a few unexpected issues but let's see how far we can get with this.
Thanks @devsnek 👍

@devsnek I get a segmentation fault if I start using import in the newly loaded script. I'm not sure setScriptSource supports ES Modules

The current issues I have:

  • calling setScriptSource with the exact same source => segmentation fault

  • Loading a class from a new import and then extending the existing class =>

#
# Fatal error in , line 0
# Check failed: args[1].IsJSObject().
#
#
#
#FailureMessage Object: 0x7ffeefbf6860
 1: 0x1001000d2 node::NodePlatform::GetStackTracePrinter()::$_3::__invoke() [node]
 2: 0x100ef507f V8_Fatal(char const*, ...) [node]
 3: 0x1006576ee v8::internal::Runtime_LoadFromSuper(int, unsigned long*, v8::internal::Isolate*) [node]
 4: 0x1009b0af4 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvInRegister_NoBuiltinExit [node]
 5: 0x100a1b0ce Builtins_CallRuntimeHandler [node]
 6: 0x10093cabb Builtins_InterpreterEntryTrampoline [node]
 7: 0x10093cabb Builtins_InterpreterEntryTrampoline [node]
zsh: illegal hardware instruction

If the new import was previously imported then it works. I can't see any import happening so I can only guess that setScriptSource doesn't trigger module loading if missing.

It seems V8 is still fixing some bugs with Module and LiveEdit (setScriptSource): https://bugs.chromium.org/p/v8/issues/detail?id=10341&q=setScriptSource&can=2

I'd also clarify: setScriptSource does not evaluate the outermost scope of a source text when it is applied. LiveEdit takes place by replacing frames that are entered after it is called.

@bmeck that's probably why import is not happening.

the import cache is purposely unexposed.

Why?

this trivial program works fine for me (assuming nope.mjs does not exist):

Fair enough. For me, however, the following poses a problem:

const { writeFileSync } = require("fs");
const assert = require("assert");

(async () => {
  const filename = "./abc.js";
  const num = 123;
  const content = `module.exports = ${num}`;

  writeFileSync(filename, content);
  assert((await import(filename)).default === num); // true

  const newNum = 456;
  const newContent = `module.exports = ${newNum}`;
  writeFileSync(filename, newContent);

  assert((await import(filename)).default === newNum); // false because of cache
})();

With require, it was easy to invalidate its cache. How would I implement the above with import?

At the moment, I don't believe you can.

maybe late ... but the only reason I have {"type": "commonjs"} in all my test/ folders is because of code coverage, which is impossible to get to 100% without cache invalidation (polyfills, different versions of nodejs, different envs, etc.)

accordingly, while I think cache invalidation would be bad in production in general, having a way to hot-reload modules, hence invalidate these, has a proven, long history, of usefulness.

if node could only expose some way to, at least, invalidate relative imports, as opposed to well known modules, it'd be great.

node --allow-import-invalidate test.js
// test.js
import('../thing.js').then(module => {
  // do something with module
  import.invalidate('../thing.js');
  // change something in the env
  import('../thing.js').then(module => {
    // do something else with the new module
  });
});

accordingly, while I think cache invalidation would be bad in production in general, having a way to hot-reload modules, hence invalidate these, has a proven, long history, of usefulness.

This is exactly my problem too. I only need to have a fresh require invocation for each test.

maybe late ... but the only reason I have {"type": "commonjs"} in all my test/ folders is because of code coverage

What exactly are you referring to with {"type": "commonjs"}? Docs?

@TimDaub it’s the default. It’s only needed if a parent package.json specifies type module (which does one thing: makes .js files be treated as ESM instead of CJS)

There are issues with the constraints the spec places on ESM; invalidation is a large topic still at TC39. Snowpack is involved in talks about module reloading (not cache invalidation) in this area. Slides were made from talks following a Realms call on the topic. For now, even if we expose the cache, it likely won't do what you want with how ESM is specced.

I don't expect import.invalidate to ever land on the Web and I personally don't want that to ever happen, which is why I've emphasized "node only". Cache invalidation is bad in CJS too imho, but it's handy for development reasons (and never for production, in my experience).

As node is used as a coverage tool, including via its c8 helper, having no way to improve ESM module code coverage, other than running the same test multiple times with different versions of node (something that won't likely sum up coverage within its exported data), seems like a big limitation.

I personally develop, and publish, dual modules, which is why I can use the CJS version of my modules within the test folder and invalidate these whenever I need, if I need, but as we're moving forward, I'd like to stop being forced to publish dual modules because I can't code-cover their cross-env/browser/node behavior.

In summary: does this need to involve TC39, instead of being a technical decision made in node, for node only?

@WebReflection with the mandates from https://tc39.es/ecma262/#sec-hostresolveimportedmodule and other host hooks, yes it does need TC39 to loosen those somehow or work around the issue

@bmeck but couldn't a special flag enforce ignoring this step?

Each time this operation is called with a specific referencingScriptOrModule, specifier pair as arguments it must return the same Module Record instance if it completes normally.

Something like this:

node --expose-dynamic-import-invalidation-at-your-own-risk-and-with-performance-issues

would work ... literally any way would work, as long as there's a work-around, otherwise dual modules it is to me, as that worked well to date.

@WebReflection it would require altering the VM (V8) to allow this, V8 generally is fragile enough around modules (see long outstanding https://bugs.chromium.org/p/v8/issues/detail?id=10284 ). I don't think this would be simpler than import.meta.hot that was talked about and a simple signaling mechanism.

@bmeck well, if import.meta.hot solves this, I'll happily wait. It wasn't mentioned in this thread, and it's the first time I read about it. If there's any link around this topic, I'd love to read it and try to figure out if that solves the current limitation, thanks.

commented

afaict, everyone who wants this functionality actually wants HMR. Maybe it would be more productive to bug a V8 product manager about HMR than to bug node about breaking cache invariants we don't control.

@devsnek we're having a conversation and it's been productive to me, as I've learned about import.meta.hot which I didn't know. As I still think HMR should not land on the Web, I was hoping node could've done something to help having HMR in development mode, but if that's not the case, then this issue could, as well, be closed.

Anyone here tried to just rename the lib folder of your project to lib1, lib2, lib3, counting up, each time a file changes? This might be a workaround 😅🙈

@WebReflection I think the issue is deeper than Node (others can correct me). Even if Node invalidates its cache, V8 won’t let it replace the ES module that’s already been loaded in V8. At least, that’s how things stand at the moment with V8, as far as I know. There was hope that a DevTools protocol, Debugger.setScriptSource if I’m remembering correctly, would let us tell V8 to change the contents of a loaded module; but that turned out not to work out.

Tbh a require('module').globalCache Map being exposed might not be the end of the world and surely doesn't need TC39. The hard part is not exposing the private module wrap interface and wanting to provide dependency graph metadata for clearing ancestors.

Wouldn't a global module map be incompatible with import maps support, since each module potentially has a contextually scoped module map?

The import map is just a specific resolver implementation, so no different to the existing hooks we already have.
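
For illustration, a minimal resolve-hook sketch of that idea, mapping a bare specifier from earlier in the thread to a hypothetical local path (using the same older loader-hook shape as the loader shown later in this thread); run it with node --experimental-loader ./import-map-loader.mjs app.mjs:

// import-map-loader.mjs (hypothetical): an "import map" expressed as a resolver.
const imports = new Map([
  ['color-names', new URL('./vendor/color-names/index.js', import.meta.url).href],
]);

export async function resolve(specifier, context, defaultResolve) {
  if (imports.has(specifier)) {
    return { url: imports.get(specifier) }; // mapped entry wins
  }
  return defaultResolve(specifier, context, defaultResolve); // everything else falls through
}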

@guybedford currently it is unsafe to populate the same URL twice in V8 in the same context per the issue linked above, it segfaults usually

Right, it sounds like a fix for that is a good first step then indeed.

@guybedford per the issue above, it has an existing changeset that fixes it, but it is no longer assigned and several attempts to bump it have been made in various locations. Even if we do change it, GC currently is completely disabled. Per Snowpack, the workflows being looked at are around signaling and manual alteration via import.meta.hot, not around exposing the Map of URL=>instance; also, URL=>instance isn't stable, as asserts (import assertions) and other things might be additional parts of the cache key. Exposing the cache key is likely unstable enough that we don't want to do so.

around signaling and manual alteration via import.meta.hot

Will this support named exports mutations? How will export * be handled? Doesn't this just get into the same problems of dynamic modules?

Working with v8 to make the cache key stable is important, yes including the GC issues. Node.js and Deno definitely are the drivers of this work. If v8 fork / custom patches are needed then even that makes sense, as having control of the module system is important to a JS platform!

Will this support named exports mutations? How will export * be handled? Doesn't this just get into the same problems of dynamic modules?

No, they completely reload with new absolute cache keys in their implementation. export * causes strong linkage that requires that whole subsection of the graph to reload if it reloads. Per https://github.com/snowpackjs/esm-hmr the idea is to manually replace locals for the boundary locations of the replacement. I had made some slides on this just to visualize their approach after that call with TC39, @JoviDeCroock likely could speak better on details than I.

Two-pronged approach seems fine yes. But then full reload scenario exactly relies on what is being discussed here - from v8 bugs to the Node.js API per the last comments.

@guybedford what prevents full reload from following the same workflow? If the entire graph is strongly connected it would still work I believe.

@bmeck not sure I understand, do you mean always relying on the import.meta.hot changing local specifiers?

@guybedford yes, and if the entire graph is reloaded (full reload) there isn't a need for changing local bindings.

@bmeck right, but you don't want to refresh the entire graph is the point. I think we do need the ability to refresh subgraphs as having the options being full reload only (as in, restarting Node.js) or local bindings only seems arbitrarily restricting.

I scanned and haven't seen this mentioned anywhere yet - given that the cache can be bypassed by passing a querystring parameter:

let c = 0;
function importFresh(mod) {
  return import(`${mod}?v=${++c}`);
}

... couldn't the issue here be reframed to "it is not possible to update a module's exports after it has been imported"?

A workaround like this is what I've been noodling on:

// convert exports to non-const bindings
export var Foo = class Foo {};
export var bar = 42;

import.meta.hot.accept(async ({ module }) => {
  /* These don't work: */
  // Object.assign(await import(import.meta.url), module);
  // Object.defineProperties(await import(import.meta.url), Object.getOwnPropertyDescriptors(module));

  ({ Foo, bar } = module);
});

Well the exports themselves alone aren't sufficient, imagine a scenario where the following happens:

const x = 8;
export const getFoo = () => 1 + x;

Changing the getFoo isn't sufficient for this scenario, we would need to replace the entire module in this case.

Replacing the export to point to the new module instance should be sufficient - getFoo becomes a reference to the new module's exported getFoo, which has its own copy of x. For preserving/modifying state, that seems like a concern that would be external to module cache invalidation.

There is a case where this is fully broken though, which is when exports are added in an updated version of a module that were not previously defined.

couldn’t the issue here be reframed to “it is not possible to update a module’s exports after it has been imported”?

There’s a loader built by @giltayar around this principle, that appends unique timestamps to specifiers so that pseudo-HMR can work. The issue is that the no-longer-needed earlier versions of such modules are never removed from memory, and so over the course of hours while you’re working (and hot-reloading the same module over and over and over) Node will eventually run out of memory and crash.

The issue is that the no-longer-needed earlier versions of such modules are never removed from memory, and so over the course of hours while you’re working (and hot-reloading the same module over and over and over) Node will eventually run out of memory and crash.

We can't really make an assumption that HMR won't leak. Even in CJS it leaks.

We can't really make an assumption that HMR won't leak.

Fair, but at least that's accidental. The query string approach guarantees eventually running out of memory.

quick one about the whole graph:

The hard part is not exposing the private module wrap interface and wanting to provide dependency graph metadata for clearing ancestors.

to have at least parity with CJS, when I delete require.cache[require.resolve('../path')] in CJS, it doesn't invalidate its required modules, neither relative paths nor installed.

accordingly, I don't think invalidating the whole subtree is needed, or even desired, or at least I could deal with the cache exposed the way CJS does.
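
A minimal sketch of that CJS behaviour, assuming a hypothetical ./parent.cjs that requires ./child.cjs:

const parentPath = require.resolve('./parent.cjs');
const childPath = require.resolve('./child.cjs');

require('./parent.cjs');                 // loads parent and, transitively, child

delete require.cache[parentPath];        // evicts only the parent entry

require('./parent.cjs');                 // parent is re-evaluated from disk...
console.log(childPath in require.cache); // ...but child is still served from the cache (true)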

quoting from @developit's suggestion:

couldn't the issue here be reframed to "it is not possible to update a module's exports after it has been imported"?

While I think it's a valid point to ask for simplification of the matter, I still disagree with doing so.
I think the point here is that node users are used to a cache dictionary at require.cache whose entries can simply be deleted using the delete keyword, so unsurprisingly they expect the same functionality when the module system is upgraded (ESM import).
Disregarding all the discussions around standards etc., the most reasonable way of fixing the issue to me would hence be to introduce a cache dictionary on import and allow the user to delete entries. Surely a lot of thought was already put into require.cache when it was implemented in node.

While adding a query string may be a valid workaround to re-import modules, it is logically something different than clearing a specific part of a cache. As others have noted, it can additionally lead to memory leaks.

IMO, any functionality that goes beyond that (e.g. fancy hot module replacement) is a specific technical vision and should be handled separately.

introduce a cache dictionary on import and allow the user to delete entries

strong agree with you, but require.cache was likely born in times when Map was not a thing, so I personally wouldn't mind if the import cache were exposed as a Map, just to have it aligned with the more modern JS it's representing.

import.meta.cache.delete('../path.js');
commented

fwiw require.cache was cited during the design of ESM as one of the motivations for an immutable cache. @TimDaub require.cache was more or less added at the whim of one of the early designers of node. I'm also curious if your specific use case is HMR.

If digging up past arguments here, then it's worth noting that a mutable registry map was always an original design goal for ESM loaders - https://whatwg.github.io/loader/#registry-constructor.

@guybedford correct, but that was not pursued to completion due to various issues and some of those original participants are on the calls mentioned above.

So, there's no way to implement hot reloading during development while using ESM modules? 😥

I tried adding a query parameter with a random string to work around the cache, but I can't do it for multiple files, only for the index file, since I'm trying to build a library that provides hot module reloading. Any help would be greatly appreciated 🙂

We have gotten esm-hmr to work in the browser, although this is a pretty "unconventional" approach in the sense that the original Module will actually stay in the browser.

As you can see in esm-hmr, on first serve we'll attempt to create a moduleGraph Map which lists a module with its dependencies, dependents and whether or not it has a module.hot.accept in the code.

When an update happens to a module that accepts updates we'll fetch said module appended with a ?mtime=x query parameter, this means that at this point we'll get the new module in-memory but we can't just inject it into the currently in-use module, so frameworks currently write code to hot-replace these. Prefresh being one of these. This code will have logic for a specific framework, in this case Preact, to hot-replace a Component.

When a child updates that doesn't accept its own updates we bubble up and treat parents that do accept updates as boundaries for updating these children. When we encounter such a boundary we'll have to deeply rewrite the imports for the path leading up to said child with the same technique of ?mtime=x.
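
A rough sketch of the kind of bookkeeping being described here, not esm-hmr's actual code: a graph of dependencies/dependents plus an "accepts updates" flag, and a bubble-up search for accepting boundaries.

const moduleGraph = new Map(); // url -> { dependencies: Set, dependents: Set, acceptsHmr: boolean }

function register(url, { dependencies = [], acceptsHmr = false } = {}) {
  const node = { dependencies: new Set(dependencies), dependents: new Set(), acceptsHmr };
  moduleGraph.set(url, node);
  for (const dep of dependencies) moduleGraph.get(dep)?.dependents.add(url);
  return node;
}

// Walk up from a changed module until we hit modules that accept updates;
// those become the HMR boundaries that get re-imported with ?mtime=...
function findBoundaries(url, seen = new Set()) {
  if (seen.has(url)) return [];
  seen.add(url);
  const node = moduleGraph.get(url);
  if (!node) return []; // unknown module: caller should fall back to a full reload
  if (node.acceptsHmr) return [url];
  return [...node.dependents].flatMap((parent) => findBoundaries(parent, seen));
}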

The tricky part presents itself when subsequent updates happen as explained here

When we update Counter.jsx, the child of Internal.jsx, we implicitly import Counter.jsx?mtime=Date.now(); this makes our babel functions re-run and register the new type for Counter.jsx. For in-place updates this works great, but when we now update Internal.jsx we'll be importing the old variant of Counter.jsx, since during esm-hmr we have no way to update this ModuleRecord in place; essentially the Counter.jsx?mtime=Date.now() is orphaned and disposed instantly.

Internal.jsx updates with the old reference, without checking whether a new one exists (byte-cache); this makes it render with the initial code rather than the new one.

This sums up the current, less than ideal approach we are taking in the browser to circumvent this caching issue. I know this is the nodejs repository, but it could be a useful bit of information.

I bypassed ESM caching by making my own loader which appends a random string as a query at the end of each file being imported. Here's the code I made for testing 👇

import { URL } from 'url'

export const resolve = async (specifier, context, defaultResolve) => {
    const result = defaultResolve(specifier, context, defaultResolve)
    const child = new URL(result.url)

    // Leave built-in modules and anything inside node_modules untouched.
    if (
        child.protocol === 'nodejs:' ||
        child.protocol === 'node:' ||
        child.pathname.includes('/node_modules/')
    ) {
        return result
    }

    // Append a random query so every resolution produces a fresh cache entry.
    return {
        url: child.href + '?id=' + Math.random().toString(36).substring(3),
    }
}

To see if this leaks memory, I made a chain of JS files that import each other, then continuously changed their contents 1000+ times and monitored node's RAM usage, and it all seems normal to me. 🤷‍♂️ The ESM cache seems to be cleared automatically as soon as no other file imports that particular file.

Before starting node, I add my loader using the following command 👇

node --no-warnings --experimental-loader ./hot.js src/index.js

Thanks to @bmeck for guiding me 😅

@vasanthdeveloper module management overhead is bytes, not megabytes :) They are certainly not cleared. Try allocating a 1MB string in each module. But yes, practically we may be able to go far on just not dealing with GC problems, until that hits a wall of course.
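
A minimal sketch of that experiment, assuming a hypothetical ./big.mjs that gets rewritten on every iteration (run with --expose-gc if you want the gc() call to do anything):

import { writeFileSync } from 'fs';

const target = new URL('./big.mjs', import.meta.url);

for (let i = 0; i < 200; i++) {
  // Each version of the module carries a distinct ~1MB string.
  const blob = String(i).padEnd(1_000_000, 'x');
  writeFileSync(target, `export const blob = ${JSON.stringify(blob)};`);
  await import(`${target.href}?v=${i}`); // query-busted import keeps every copy alive
  if (i % 50 === 0) {
    global.gc?.(); // only available with --expose-gc
    console.log(i, Math.round(process.memoryUsage().heapUsed / 1e6), 'MB heap');
  }
}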

I'm gonna try adding 1MB string in those JavaScript files. But since it's only for development, I think it's tolerable.

Also using the query string method for the in-progress ESM version of JavaScript Database (JSDB), and it’s not tenable due to the memory leak (my append-only JavaScript data files can be hundreds of megabytes in size).

I’d love to see Node pave the cowpaths on this and support the query-string method of ESM cache-busting by garbage collecting the previous version of the module when it detects the practice. (I haven’t peeked into the code so this is probably way easier said than done.)

@aral V8 won't GC the module once it is linked, please note that Node does not modify V8 generally and so the issue of allowing GC of modules is likely better on their issue tracker.
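
One hedged way to observe that retention directly; a diagnostic sketch, not an established pattern, and ./some-module.mjs is hypothetical:

const registry = new FinalizationRegistry((label) => {
  console.log(`collected: ${label}`);
});

async function probe() {
  const ns = await import(`./some-module.mjs?v=${Date.now()}`);
  registry.register(ns, 'query-busted module namespace');
} // our own reference to the namespace is dropped here

await probe();
global.gc?.(); // only with --expose-gc; the callback is still not expected to fire,
               // because the module registry keeps the linked module alive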

@giltayar have you looked into using Workers or other solutions to have a module cache that you can destroy (such as by killing the Worker)?

+100000000 for using Workers. I created a small worker script that receives the module path and some arguments (as an object) for the module in the workerData object. The worker uses a dynamic import to load the module and execute it using the arguments. It then just returns the result to the parent. Since the worker executes in its own module context there is no need to invalidate the cache. Works perfectly. ❤️

Node.js 14.x lts

// main script (runs in the parent thread)
import { Worker } from 'worker_threads'

const worker = new Worker('./module_executor.js', {
  workerData: { modulePath: './path/to/module.js', args: { arg1: 1, arg2: 2 } }
})

worker.once('message', result => {
  console.log(result)
})

worker.once('error', error => {
  console.error(error)
})

// module_executor.js (the worker script)
import { workerData, parentPort } from 'worker_threads'

async function executeModule({ modulePath, args }) {
  const { default: mod } = await import(modulePath)
  // const { mod } = await import(modulePath) // depending on how you exported

  let result;
  if (mod.constructor.name === 'AsyncFunction') {
    result = await mod(args)
  } else {
    result = mod(args)
  }

  parentPort.postMessage(result)
}

executeModule(workerData)

@vsnthdev No idea why your solution uses so little memory... but it actually works!

PS: I'm currently in the process of migrating lambda-tdd and node-tdd from require to import.

@simlu Thank you 😊

But please refrain from using this in production.

I used to use this technique to primarily have HMR support during development 🙂

@vsnthdev Oh I would never. Just for lots and lots of test suites =)

Still only invalidates the cache for the imported module, not for the indirect dependencies. So still no switching to ESM 😟.

@vsnthdev I've used your method to implement cache blasting in mocha --watch mode for ESM imports, and it works!

mochajs/mocha#4374 (comment)

@vsnthdev I've used your method to implement cache blasting in mocha --watch mode for ESM imports, and it works!

mochajs/mocha#4374 (comment)

I am glad it helped you 😊

This might just be the dumbest thing to do (and will probably do bad things to node's cache system), but you can add random parameters to your imports to get an uncached version

const myModule = await import(`./myFile?cachebust=${Date.now()}`)

Needless to say that this isn't for a prod system

Here is the hot reload file we use to achieve the balance between cache invalidation and cache reuse, to be able to run hundreds of tests without hitting memory limits while still invalidating the necessary files.

https://github.com/blackflux/robo-config-plugin/blob/master/test/projects/assorted/%40npm-opensource/test/hot.js

We invalidate by environment variables and by comment. This requires adding strategic comments to the code to invalidate the necessary files.

For inspiration or for use as is. This took some fine tuning. Do not use in production

Ah yes, I was wondering why @remix-run's dev mode had a memory leak; turns out it's because of cache busting...
Btw, may I know the reason why cache removal is not coming to nodejs' ESM? I can't even tell whether nodejs focuses on browser specs like deno, or on server-side stuff...
Why talk about deno here? Because they have import mappings, and here we can't even get cache unloading; like, it's not that dangerous? Pretty sure it's going to be used in dev mode and not in production. Like come on, don't tell me hackers will hack via this method! There are literally a lot of ways to easily hack a PC if hackers can get their hands on it!

Can't we just keep it as simple as it was before? Workers need a lot more lines + complexity, unlike deleting the require cache 😄

@renhiyama essentially, v8 is in control of the module cache & node has no way to interact with it. i think there was some discussion around getting v8 to add an api, but it's quite an uphill trek & it's not likely to be undertaken vigorously when cache-busting can be achieved in userland via the workaround provided above.

@zackschuster but cache busting creates a memory leak since the old ones are not wiped out unless using workers? (or maybe even with workers; I didn't test workers till now since it looks more complex than clearing the require cache)

Does the experimental ESM Loader Hooks API, released this week in Node.js 18.6.0, provide any new avenues for solving this problem?
https://github.com/nodejs/node/releases/tag/v18.6.0
https://dev.to/jakobjingleheimer/custom-esm-loaders-who-what-when-where-why-how-4i1o

but cache busting creates a memory leak since the old ones are not wiped out

hot reloading in a browser has the same issue, incidentally.

unless using workers? (or maybe even with workers; I didn't test workers till now since it looks more complex than clearing the require cache)

i think workers have ties to the FinalizationRegistry that help ensure they get cleaned up, but don't quote me on that.

I like how chrome doesn't even try a single thing to lower the RAM usage, not even providing alternative methods so memory leaks don't happen...

This cowpath should be paved. The original intent of shipping a Module Registry for EcmaScript Modules was a good one, that recognized a valid need very very widely faced. Trying to un-ship this capability denies us what we need. Give the users what they want.

Does the experimental ESM Loader Hooks API, released this week in Node.js 18.6.0, provide any new avenues for solving this problem? https://github.com/nodejs/node/releases/tag/v18.6.0 https://dev.to/jakobjingleheimer/custom-esm-loaders-who-what-when-where-why-how-4i1o

vsnthdev's workaround seemingly uses the programmable ESM loader hooks to redirect every request to a new, uniquely named request, which seems to work ok.

Does the experimental ESM Loader Hooks API, released this week in Node.js 18.6.0, provide any new avenues for solving this problem?
https://github.com/nodejs/node/releases/tag/v18.6.0
https://dev.to/jakobjingleheimer/custom-esm-loaders-who-what-when-where-why-how-4i1o

Nope, sadly not.

Also, if the officially-linked article (see end of https://dev.to/jakobjingleheimer/custom-esm-loaders-who-what-when-where-why-how-4i1o) for how to do ESM cache invalidation with Node.js (https://dev.to/giltayar/mock-all-you-want-supporting-es-modules-in-the-testdouble-js-mocking-library-3gh1) includes (count them) four “sneaky!” exclamations while describing the method, that’s a smell.

Not mentioned in that article: that the method leaks memory.

Sneaky!