N-API: An API for embedding Node in applications
empyrical opened this issue
Is your feature request related to a problem? Please describe.
Right now there isn't a documented/stable way to use Node as a shared library inside an application. If one were made using N-API, it would also open up using Chakra in addition to V8 in an application.
Describe the solution you'd like
I would like for there to be stable APIs in node_api.h for creating/managing a Node environment.
Functions that do this could hypothetically look like:
```c
NAPI_EXTERN napi_status napi_create_env(int* argc, const char** argv, napi_env* env);

// Start the node event loop
NAPI_EXTERN napi_status napi_run_env(napi_env env);

// Cleanup (e.g. FreeIsolateData, FreeEnvironment and whatever else needs to be run on teardown)
NAPI_EXTERN napi_status napi_free_env(napi_env env);
```
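To make the intent concrete, a minimal usage sketch (hypothetical - none of these functions exist in node_api.h today, and the signatures are just the ones proposed above):

```c
#include <node_api.h>

int main(int argc, char** argv) {
  napi_env env;

  // Create a Node environment from the process arguments (hypothetical).
  if (napi_create_env(&argc, (const char**)argv, &env) != napi_ok)
    return 1;

  // Block until the event loop has drained, like the node binary does today.
  napi_status status = napi_run_env(env);

  // Tear down the environment and its isolate data.
  napi_free_env(env);
  return status == napi_ok ? 0 : 1;
}
```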
The embedder could get this environment's libuv loop using napi_get_uv_event_loop. But I would also like to leave open the possibility of providing my own libuv loop that I have control over, to help integrate with other event loops (e.g. Qt's event loop). This could look like:
```c
NAPI_EXTERN napi_status napi_create_env_from_loop(int* argc, const char** argv,
                                                  napi_env* env, struct uv_loop_s* loop);
```
Keeping the event loop going (using uv_run on the env's loop) would then be the embedder's responsibility.
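For example, the integration could look roughly like this (napi_create_env_from_loop is still hypothetical; the driving strategy - a timer, idle callback, or fd watching - is up to the embedder):

```c
#include <uv.h>
#include <node_api.h>

static uv_loop_t embedder_loop;
static napi_env embedded_env;

void embedder_init(int* argc, const char** argv) {
  uv_loop_init(&embedder_loop);
  // Hypothetical: tie the new Node environment to the embedder-owned loop.
  napi_create_env_from_loop(argc, argv, &embedded_env, &embedder_loop);
}

// Called periodically from the host's own event loop (e.g. a GUI timer).
void embedder_pump(void) {
  // Run whatever work is ready, then return instead of blocking.
  uv_run(&embedder_loop, UV_RUN_NOWAIT);
}
```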
Also, right now methods like node::CreateEnvironment seem to always jump into a REPL unless you provide a string to evaluate or a file to run. Tweaks will have to be made to make this nicer to use for embedding.
These APIs are just hypothetical, and will probably change when an actual attempt to implement them is made.
I am up to trying to implement this, but I would like to see what kind of discussion happens first and what other ideas people have before I start.
Implementation Progress
- Create a clean non-NAPI way to use Node embedded
- Create NAPI functions for creating and managing environments
- Create a NAPI function for evaluating a string (exists in NAPI v1: napi_run_script; see the sketch after this list)
- Create a NAPI function for running a script from file
- Investigate if this can play nicely with worker_threads.
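For the string-evaluation item, the N-API side already exists today; given an env from whichever creation API eventually lands, it would look something like this (sketch only):

```c
#include <node_api.h>

// Evaluate a source string in the given environment using napi_run_script (N-API v1).
napi_status eval_string(napi_env env, const char* src, napi_value* result) {
  napi_value script;
  napi_status status =
      napi_create_string_utf8(env, src, NAPI_AUTO_LENGTH, &script);
  if (status != napi_ok) return status;
  return napi_run_script(env, script, result);
}
```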
Describe alternatives you've considered
I've tried using the unstable APIs, and they aren't fun to keep up with 😅
For discussions on how the shared library can be distributed, see this issue: #24028
I've tried using the unstable APIs, and they aren't fun to keep up with 😅
A big part of that is that they haven’t ever been designed as a coherent API (or designed at all, really), and we would likely need to iterate on them a bit more before they are stable – which is probably also the point where we can start to talk about enabling N-API versions of them.
If you want to work on this, good starting points might be #21653 (comment), or splitting CreateEnvironment() into a function that, well, creates the Environment, and one that calls Environment::Start() under the hood?
Thanks for helping point out where to start! I also noticed some relevant TODOs in 'node_worker' that would get resolved by more stable APIs for this.
@rubys is another person we should loop into discussions/the team about use cases/testing/APIs for using Node.js as a shared library.
First observation: we should plan to move to having --shared as the default for both CI and releases. This would make releases include a shared library that could be used by third parties. In an unscientific comparison on macOS, the combined executable + dynamic library would be a total of 0.1% bigger than a standalone executable.
Second, I would suggest that one of the goals be to allow electron to be built using exclusively NAPI interfaces. See electron/atom/app/node_main.cc.
This means that in addition to Create and Destroy environments, there would need to be an interface to execute a script in an environment, and to evaluate an expression in that environment.
Like - making the node command basically just be node_main.cc that links against libnode? Would be very nice! And while we're at it, it would also be nice to include CMake and pkgconfig modules for finding libnode that would ship with it.
@empyrical today if you do the following on Mac or Linux:

```
./configure --shared
make -j4
```

You end up with out/Release/node and out/Release/libnode.67.dylib or out/Release/lib.target/libnode.so.67. Adding additional N-API APIs would be straightforward; I'm merely stating that it should be a goal to add enough APIs to make electron's node_main.cc not need to depend on any other APIs.
But again, we would either need to include these libraries in the existing releases or have separate releases.
Oh - I misunderstood. I thought you meant only building the --shared version of Node, and making the node executable you use from the CLI just a very small executable that links against libnode.
@empyrical that's actually what --shared does. Here are the sizes of the output files on macOS:
```
$ ls -l out/Release/node out/Release/libnode.67.dylib
-rwxr-xr-x  1 rubys  staff  40410544 Sep 29 16:33 out/Release/libnode.67.dylib
-rwxr-xr-x  1 rubys  staff      9208 Sep 29 16:33 out/Release/node
```
Just two quick things to note:
- I don’t know if that’s implied here, but I don’t think we can get away with a default where people have only a libnode + wrapper available as part of the release tarballs
- Using --shared is definitely something that embedders will tend to do more often than others, but it's orthogonal to the Embedder API by itself
Curious for some thoughts with regards to worker_threads: If you create multiple envs, should they all be "main threads" with a threadid of 0, with workers for those envs created via a separate hypothetical API, or should the first one created be the "main thread" and subsequent ones be considered "workers" with incrementing threadids?
And should the "main thread" only be allowed to be made in the process' main thread? JS code that checks worker_threads.isMainThread to see if it's safe to do something, e.g. call functions in a GUI binding (which typically only work in the main thread), may have issues if a "main" JS thread isn't truly in the process' main thread.
Maybe there should be a NAPI function for creating a "main" env, and then a different one for subsequent ones?
Basically:
```c
// Any more than one invocation per process would result in an error
NAPI_EXTERN napi_status napi_create_main_env(int* argc, const char** argv, napi_env* env);

// Parent env should also show up as parentPort on worker_threads
NAPI_EXTERN napi_status napi_create_env(napi_env parent_env, napi_env* env);
```
I don’t think we can get away with a default where people have only a libnode + wrapper available as part of the release tarballs
Why not?
Why not?
IMO:
- node executable probably enjoys the most compact binary for a language runtime of all time - no linkage dependency other than the c|c++(rt)
- embedding use cases may be too small to warrant a change in the default in favor of those.
node executable probably enjoys the most compact binary for a language runtime of all time - no linkage dependency other than the c|c++(rt)
I'm clearly not understanding the downside. How is a 4M executable better than a 9k executable plus a 4M libnode?
Alternatives:
- a 4M binary plus a 4M libnode.
- Two separate release bundles (and sets of CIs), one with a standalone binary, and one with a libnode.
downsides are mostly unforeseen consumability issues at the end-user: for example, the user needing to explicitly set LD_LIBRARY_PATH or LIBPATH or PATH. There could be other platform-specific disparities in symbol resolution (precedence between the launcher and the library), issues stemming from other node processes sharing the library, etc.
@gireeshpunathil others seem to manage without these problems; but in any case, what alternative would you suggest?
I don't know. In most of my interactions with embedding users in the nodejs/help repo, I see they build from source - not because they don't have a libnode, but because each one of them wanted to embed node at a different level of abstraction - 2 node::Init, 3 node::Start, create/re-use env, re-enter env, multi-isolate spawning etc. necessitate them to build from source.
Once we have normalized these into one or two or three discrete entry points, we could expose (only) those that lead to improved consumption of libnode; and that should help us take an easy decision. One obvious route is to release regular (exe) and libnode separately, against a specified version.
One obvious route is to release regular (exe) and libnode separately, against a specified version.
I agree, that is probably the best way forward. Dynamic linking can be pretty painful when copying executable files around (which even our own test suite does on a regular basis).
This discussion relates to nodejs/Release#341 as well. If we had a Development kit and a Deployment kit (or equivalent) then we could add a shared library in addition to the existing exe without concern over the additional size.
I think that the shared library stuff is worth an issue of its own, imo! Sadly, some questions I had about what the N-API could look like got buried by this talk. (I can edit in a link to the top level issue if one exists)
edit: link to the issue: #24028
I'm clearly not understanding the downside. How is a 4M executable better than a 9k executable plus a 4M libnode?
I think people value the "single file" node binary approach. Being able to move the node executable around by itself has benefit (IMO), and for a lot of use-cases disk space is cheap, so that's less of an issue.
Two separate release bundles (and sets of CIs), one with a standalone binary, and one with a libnode.
Sounds like a win-win to me (hopefully not too much extra pain for CI / build).
Going to close this for now and remake this issue when I've got time to try and implement the N-API stuff.
I made a new discussion for the shared library stuff here for those interested: #24028
ping @nodejs/n-api, would you consider adding this to your backlog?
@empyrical I'm sorry I didn't voice my objection to some of the language and tone of comments that were made in this thread.
Personally I'd welcome it if you reopen this issue, since you raise valid points and make a well reasoned argument, that should be kept in consideration.
I was not bothered by anything anyone said, I closed it because the conversation went off topic to a different (but very important!) subject. My plans are to make a new issue for this when I have time to try and implement, and put a link to the new issue I created for working out how the shared library should be distributed so the comments are just about NAPI.
I've been wondering... do we need to attach the outline of a more embedder-friendly API to the environment? Essentially there are at least multiple levels of abstraction here:
- A node that encapsulates the operations of V8 initialization, spinning off the libuv event loop, and doing a bunch of tracing etc. The embedder doesn't need to customize the engine/isolate and the event loop (like the third_party_main.js use cases; I think that's closer to what @rubys's demo tries to address?)
- A smaller node that leaves the JS engine/isolate and libuv event loop customizable (that may disable a few tracing/diagnostics things that are VM-dependent), but needs to include the Node.js native module loaders - that's probably electron, node-chakracore and IoT.js's use case, and also probably what our worker implementation wants
- An even smaller node that only encapsulates the C++ bindings and JS source (node_javascript.cc) and leaves everything else customizable to the embedder - so it needs to also exclude a lot of environment-dependent logic; that's what our potential mksnapshot and mkcodecache would need
Maybe it would help if we start out refactoring the bootstrap process to reflect the different levels of abstractions, while adding cctests for them along the way (at that point these are all internals so we are free to change them), and then creating APIs on top of that would presumably be easier. Starting out with a specific set of APIs in mind and then implementing them in a top-down manner would be harder to get done given how entangled the current bootstrapping process is and how different the use cases are for different embedders, IMO.
I'm reopening this issue as I plan to resume work on it (the API, not the shared library) when I return from nodeconf.eu.
Sounds good! 👍 Since it seems you are working on the non-NAPI APIs to be used as the basis for the NAPI APIs, could you look into providing your own libuv loop for Node to use? Last time I tried that it wasn't so easy, and the code I made doesn't even compile in today's Node.
@empyrical do you have a link to code that used to work, or a test case, or even a sketch of what you would like to accomplish? I plan to include test cases with my implementation so that the functions advertised to work continue to work.
I first did this way back when this was still a WIP: #6994
I basically pieced it together by taking the official V8 embedding example on the V8 website, and looking at bits of node::Start to see if I could put in my own event loop that I control.
Looking at it again, I just used uv_default_loop, which is exactly what node uses for its main environment anyways. So all I'd want to see is some kind of ability to manually run the events in node's event loop, instead of blocking until node is done like node::Start() normally does.
Here are some key parts:
```cpp
// Create the loop
uv_loop_t *loop = uv_default_loop();

// Inside of an Isolate::Scope, I called these functions:
node::IsolateData *isolateData = node::CreateIsolateData(isolate, loop);
node::Environment *environment = node::CreateEnvironment(isolateData, context, testArgc, testArgv, 0, NULL);

// Made a QTimer on an interval to call the events in the event loop
// (the lambda needs to capture `loop` for this to compile)
auto timer = new QTimer();
QObject::connect(timer, &QTimer::timeout, [loop]{
    uv_run(loop, UV_RUN_NOWAIT);
});
timer->setInterval(10);
timer->start();
```
There is probably a less hacky way to get it into Qt's event loop, but this worked for keeping a basic node http server going inside of Qt's event loop.
The ability to provide an arbitrary loop might be useful, however, for making a node environment in another thread.
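A possibly less hacky option (not from my original experiment, just a sketch): instead of polling on a fixed 10 ms timer, let libuv tell the host loop when it needs attention via uv_backend_fd/uv_backend_timeout, and only call uv_run when the fd is readable or the timeout expires:

```cpp
#include <uv.h>

// Milliseconds the host loop may sleep before pumping again; 0 means
// "work is pending right now", -1 means "wait for backend fd activity".
int node_loop_timeout_ms(uv_loop_t* loop) {
  return uv_backend_timeout(loop);
}

// File descriptor the host loop can add to its own poll set
// (epoll/kqueue-backed platforms).
int node_loop_fd(uv_loop_t* loop) {
  return uv_backend_fd(loop);
}

// Called whenever the fd becomes readable or the timeout expires.
void pump_node(uv_loop_t* loop) {
  uv_run(loop, UV_RUN_NOWAIT);
}
```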
@empyrical I asked the question poorly, but you answered it. What you need to be able to do is:
- manually run the events in node's event loop, instead of blocking until node is done like node::Start() normally does
- keep a basic node http server going
Looking at my current demo, the first is clearly satisfied. After you call nodeSetup, you can call nodeExecuteString any arbitrary number of times before you call nodeTeardown.
Not obvious from this example, each call to nodeExecuteString blocks until all events are processed (mirroring the behavior of the node command line). Is this sufficient for your requirements?
My preference would be to satisfy as many use cases as possible while exposing as few as possible of the implementation details, thereby insulating callers from potential future internal changes. I'll note that the node internals changed from the point where I started developing this demo to the point where I first published it, so the need to keep up with the internal changes is a real problem.
executeString blocking until events are done does not satisfy the event loop integration I'm looking for, because if it blocked it would also block the UI. Perhaps another argument to executeString could be provided to specify whether or not it's blocking?
while exposing as few as possible of the implementation details
I think that the UV loop is probably a safe thing to expose to embedders, because it is even exposed in NAPI:
Lines 608 to 610 in bde8eb5
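(Assuming those lines are the napi_get_uv_event_loop declaration mentioned earlier, fetching the loop from an env already works today:)

```c
#include <uv.h>
#include <node_api.h>

// Returns the libuv loop backing the given environment, or NULL on failure.
uv_loop_t* get_env_loop(napi_env env) {
  struct uv_loop_s* loop = NULL;
  return napi_get_uv_event_loop(env, &loop) == napi_ok ? loop : NULL;
}
```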
Now, for testing this use case (making it work inside another event loop) I think that perhaps I could find a small c++ event loop library to try and embed node into (or even try and run the node loop from inside another UV loop!)
making it work inside another event loop
Got it. Thanks for the clarification!
Perhaps another argument to executeString
Or perhaps another function.
I think that the UV loop is probably a safe thing to expose
Agreed. The subtle point is between allowing and requiring. Allowing access for people with use cases that require it: good. Requiring access for people who have use cases that don't require it: not so good.
Hello guys, sorry for being late but this may be interesting.
Recently I released MetaCall (https://github.com/metacall/core). It allows you to easily embed NodeJS into C. MetaCall is a library that allows calling functions between multiple languages, and NodeJS is one of them. I am using this library to build a high-performance FaaS (Function as a Service).
This is an example of embedding NodeJS in C with MetaCall:
NodeJS code (nod.js):
```js
#!/usr/bin/env node

function hello_boy(a, b) {
    console.log('Hey boy!!');
    return (a + b);
}

module.exports = {
    hello_boy,
};
```
C code:
```c
const char * scripts[] = { "nod.js" };
const enum metacall_value_id ids[] = { METACALL_DOUBLE, METACALL_DOUBLE };

metacall_load_from_file("node", scripts, 1, NULL);

void * ret = metacallt("hello_boy", ids, 3.0, 4.0);

metacall_value_to_double(ret); // 7.0
metacall_value_destroy(ret);
```
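Wrapped into a complete program it would look roughly like this (assuming MetaCall's metacall_initialize()/metacall_destroy() entry points and the metacall/metacall.h header; check the MetaCall docs for the exact API of your version):

```c
#include <metacall/metacall.h>
#include <stdio.h>

int main(void) {
  if (metacall_initialize() != 0)
    return 1;

  const char * scripts[] = { "nod.js" };
  const enum metacall_value_id ids[] = { METACALL_DOUBLE, METACALL_DOUBLE };

  /* Load the NodeJS script, then call the exported function with two doubles. */
  metacall_load_from_file("node", scripts, 1, NULL);
  void * ret = metacallt("hello_boy", ids, 3.0, 4.0);

  printf("%f\n", metacall_value_to_double(ret)); /* 7.0 */
  metacall_value_destroy(ret);

  return metacall_destroy();
}
```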
To me NodeJS was one of the most difficult run-times to embed, for the following reasons:
- Run-time design: In contrast to Python or Ruby, which are implemented as a library and an executable separately for the run-time and the interpreter (and the same for V8 and D8), NodeJS is bundled all in the same executable, so I had to modify the build step in order to generate a shared library that can be loaded by another executable.
- Build system: node-gyp is difficult to integrate with CMake, which is the build system I'm using.
- Lack of an embedding API: NodeJS has a great extension API (N-API) but lacks a good embedding API. A method like a C main (node::Start) is not enough to embed a run-time. You can check the Python C API, which is really flexible and well documented for embedding scenarios.
The design decisions of MetaCall are based on reducing technical debt, so I refused to modify the NodeJS code itself. For this reason I had to find a hacky method to expose NodeJS internals. Yesterday I was researching and I found this post (nodejs/help#818 (comment)) which describes a method similar to the one I have achieved.
The node_loader (https://github.com/metacall/core/tree/develop/source/loaders/node_loader) has three parts: the plugin (which is the loader itself), the bootstrap and the trampoline. MetaCall is based on a plugin architecture, so you need to implement a new plugin for each run-time. The plugin starts NodeJS with the node::Start method and passes bootstrap.js (https://github.com/metacall/core/blob/develop/source/loaders/node_loader/bootstrap/lib/bootstrap.js) as the entry point. Here I do a nasty trick: I convert a host function (register) from a pointer into a string in order to pass it through the bootstrap script, and the same with the pointer to the node loader implementation. Inside the bootstrap I define an interface that implements the node loader interface at a higher level, using bootstrap as sugar for the node loader plugin.
And now comes the inversion of control. From bootstrap I load the trampoline (https://github.com/metacall/core/blob/develop/source/loaders/node_loader/trampoline/source/trampoline.cc), which is a NodeJS extension, so I can use C/C++ and at the same time I am able to access NodeJS internals. From here I convert the string of the register function from the host back into a pointer, and then I inject (https://github.com/metacall/core/blob/bcaea1b6896229c3995e33da2754d57b6ff09a1e/source/loaders/node_loader/source/node_loader_impl.cpp#L1220) the NodeJS internals that I need into the host. In this case, the host is the node loader, which is a plugin itself, but there I have the control and I can return the control back to the real host easily.
At this point I have achieved transforming the N-API into an embedding API, so the model gets inverted, and I can reuse the N-API (giving me the abstraction between different NodeJS versions, following the original design decisions).
But now there is another part: the event loop. With this you have to be tricky too. First of all, NodeJS cannot run in the same thread that MetaCall runs in, because the event loop would block the main thread. For this reason the start function (node::Start) is launched from a new thread. This new thread (https://github.com/metacall/core/blob/bcaea1b6896229c3995e33da2754d57b6ff09a1e/source/loaders/node_loader/source/node_loader_impl.cpp#L1317) then becomes blocked and NodeJS is able to run the event loop and the thread pool.
Now, to do a call to NodeJS, the only thing I have to do is an async call that is going to be executed in the event loop, blocking the main thread. This block can be avoided in the future too, but I still have not implemented async calls in MetaCall (this means the MetaCall core is not thread safe for now), but the node loader is. In the end, this can be seen as an inverse promise. Calls get executed and the event loop gets decoupled.
Apart from this I have found many other limitations, but they are small details, like correctly finishing the NodeJS run-time (https://github.com/metacall/core/blob/bcaea1b6896229c3995e33da2754d57b6ff09a1e/source/loaders/node_loader/source/node_loader_impl.cpp#L1719), or providing a standard way to access the executable run-time name (related to libuv: https://github.com/metacall/core/blob/bcaea1b6896229c3995e33da2754d57b6ff09a1e/source/loaders/node_loader/source/node_loader_impl.cpp#L1332), or re-initializing the run-time, which breaks some asserts in the NodeJS internal isolate.
I hope this helps to improve NodeJS. And in any case, you can always use the library to embed NodeJS. I am going to maintain the support and provide install packages for multiple platforms and front-ends for multiple languages. So you can use NodeJS from Python, for example.
I'd like to ask for the current state of the refactorings to ease embedding.
I've seen many commits in master (many from @joyeecheung) but cannot
figure out the sequence of calls which would be necessary to properly initialize
an embedded node instance.
Any hint would be highly appreciated.
@darabi There are currently three parts in initializing an embedded node instance:
- Initializations that must be done once per process, currently node::Init() (line 791 in fa3eefc), exposed in node.h.
- Initializations that need to be done prior to the creation of an Environment, currently encapsulated in the constructors of the internal NodeMainInstance (for the main thread) and Worker (including another WorkerThreadData class, for worker threads) classes. This part is still changing; my plan is to create a base class for these two classes when we figure out the right parts to encapsulate in the abstraction. Some of these initializations, e.g. node::CreateIsolateData, are exposed in node.h, but most of them are left for the embedder to customize.
- Initializations of the Environment, currently exposed as node::CreateEnvironment and node::LoadEnvironment though they are currently still quite awkward to use (e.g. you have to include your _third_party_main.js in your build in order to have a correct JavaScript entry point instead of going into the REPL). There are also remaining questions in terms of the functionality these should include, like whether the embedder wants to include Node.js's V8 inspector implementation or not.
Most of my recent refactoring work focuses on making it easier to integrate custom V8 snapshots in Node.js, but it's also an open question whether we could allow embedders to include their own snapshots (the embedders have control over the Isolate they create, but Node.js needs to restart from the correct place and initialize the external references correctly from the snapshot).
For the current calling sequence that uses mostly APIs available in node.h, you can take a look at test/cctest/node_test_fixture.h. For other scenarios you can take a look at the NodeMainInstance, Worker and WorkerThreadData classes. They are currently still unstable and dedicated to internal use, though; I would like to see at least some reasonable abstractions between the main instance and the workers in place, and some proper cctests written, before considering adding any public APIs for this kind of functionality in node.h, as premature abstraction without real-world use cases is not ideal - with so many things going on during bootstrapping, it's yet unclear what exactly should be customizable and in what way.
However, if you just want to embed a simple Node.js instance for the main thread that does not coexist with any other ones in the same process, and do not need to customize any of the V8/ICU/OpenSSL bits or the libuv event loop, I think node::Start() exposed in node.h combined with your own _third_party_main.js entry point would do the job just fine, but there were also discussions about deprecating _third_party_main.js.
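Pulling together the calls named above, the rough shape of the sequence is sketched below; exact signatures differ between Node.js releases, so treat this as pseudocode and check test/cctest/node_test_fixture.h for the version you target. The V8 platform/Isolate/Context setup is left to the embedder, and node::Init() really belongs once per process before the Isolate exists - it is inlined here only to keep the sketch in one function:

```cpp
#include <node.h>
#include <uv.h>
#include <v8.h>

void run_embedded_node(int argc, char** argv,
                       v8::Isolate* isolate,              // created and entered by the embedder
                       v8::Local<v8::Context> context) {  // created and entered by the embedder
  // Once per process.
  int exec_argc;
  const char** exec_argv;
  node::Init(&argc, const_cast<const char**>(argv), &exec_argc, &exec_argv);

  // Per-isolate state, tied here to the default libuv loop.
  node::IsolateData* isolate_data =
      node::CreateIsolateData(isolate, uv_default_loop());

  // Per-environment state: create it, run the JS entry point, drain the loop.
  node::Environment* env = node::CreateEnvironment(
      isolate_data, context, argc, argv, exec_argc, exec_argv);
  node::LoadEnvironment(env);
  uv_run(uv_default_loop(), UV_RUN_DEFAULT);

  // Teardown.
  node::FreeEnvironment(env);
  node::FreeIsolateData(isolate_data);
}
```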
Thank you for your detailed response! I will look at cctest.
@joyeecheung Apart from the three parts of initialization mentioned above, another case would be fork(2)ing a node process after it has properly started and expecting child processes to continue to run correctly (things might depend on whether they are copy-able to the child process) without a re-exec.
A similar example is the libuv API uv_loop_fork, which reinitializes any kernel state necessary in the child process after a fork(2) system call.
@legendecas I have explained the same problem here: https://github.com/metacall/core#57-fork-model
As the NodeJS thread pool is not exposed properly, it dies after doing the fork and it cannot be handled. The only solution that I found is to intercept fork calls and de-initialize the whole run-time before forking, and initialize it after the fork in the child and the parent simultaneously.
3. Initializations of the `Environment`, currently exposed as `node::CreateEnvironment` and `node::LoadEnvironment` though they are currently still quite awkward to use (e.g. you have to include your `_third_party_main.js` in your build in order to have a correct JavaScript entry point instead of going into the REPL). There are also remaining questions in terms of the functionality these should include, like whether the embedder wants to include Node.js's V8 inspector implementation or not.
To solve this, it can also be done by customizing the Start arguments and passing an existing script that is located on the filesystem:
https://github.com/metacall/core/blob/4b563c67164c32bb344cf02649500da33afb1ebb/source/loaders/node_loader/source/node_loader_impl.cpp#L1362
https://github.com/metacall/core/blob/4b563c67164c32bb344cf02649500da33afb1ebb/source/loaders/node_loader/source/node_loader_impl.cpp#L1390
https://github.com/metacall/core/blob/4b563c67164c32bb344cf02649500da33afb1ebb/source/loaders/node_loader/source/node_loader_impl.cpp#L1453
https://github.com/metacall/core/blob/4b563c67164c32bb344cf02649500da33afb1ebb/source/loaders/node_loader/source/node_loader_impl.cpp#L1501
And this is the script (which is used as a trampoline to bootstrap NodeJS internals and expose them back into the host application):
https://github.com/metacall/core/blob/develop/source/loaders/node_loader/bootstrap/lib/bootstrap.js
@viferga I am not sure if adding more arguments to node::Start is the right way to go... it works fine if all you want to customize are just the CLI arguments and the entry point, and you assume there is only one Node.js instance that owns the configuration of the entire process, but this leaves very little room to be flexible if later someone wants other parts to be configurable as well, so I suspect they would still have to patch the source somehow if they really want to embed Node.js, which kind of defeats the purpose of exposing it as an embedder API.
As far as the current internals go, it's already fairly easy to do this customization by modifying the NodeMainInstance class a bit so that it takes an additional path and loads the main script from that instead of calling LoadEnvironment, which selects from existing main scripts embedded in the binary.
@joyeecheung One of the first design decisions of MetaCall was to avoid technical debt as much as possible, so I figured out how to solve the problem without patching any line of NodeJS code. I tried this in 8.x and 10.x and it works like a charm. I did not try 12.x (if you made any changes there, it would be interesting to check it out).
I found many limitations when using LoadEnvironment (it was my first approach); it could not provide me access to the current NodeJS instance internals (like the isolate), so it was unusable in the end. Basically, it is impossible to use it to truly embed NodeJS, only to run scripts in a synchronous manner (blocking the main thread, and without having any control over NodeJS), and that is too limited for my purposes.
To have access to NodeJS internals I had to create a mechanism with a trampoline that does the following:
- The host application launches a thread with NodeJS Start (https://github.com/metacall/core/blob/38abb39eed36a06584305653ff7378ae50b1b3e8/source/loaders/node_loader/source/node_loader_impl.cpp#L1552 and https://github.com/metacall/core/blob/38abb39eed36a06584305653ff7378ae50b1b3e8/source/loaders/node_loader/source/node_loader_impl.cpp#L1501), passing as arguments the bootstrap.js file with some other parameters (they will be explained later).
- NodeJS starts and loads bootstrap.js (https://github.com/metacall/core/blob/develop/source/loaders/node_loader/bootstrap/lib/bootstrap.js).
- The bootstrap loads an addon called trampoline.node implemented with N-API (https://github.com/metacall/core/blob/38abb39eed36a06584305653ff7378ae50b1b3e8/source/loaders/node_loader/bootstrap/lib/bootstrap.js#L219 and https://github.com/metacall/core/blob/develop/source/loaders/node_loader/trampoline/source/trampoline.cc).
- At this point I can access NodeJS internals from C/C++ land (napi_env env: https://github.com/metacall/core/blob/38abb39eed36a06584305653ff7378ae50b1b3e8/source/loaders/node_loader/trampoline/source/trampoline.cc#L54).
- The trampoline receives from the host application the function pointer to the callback that is going to be used to inject the NodeJS internals (https://github.com/metacall/core/blob/38abb39eed36a06584305653ff7378ae50b1b3e8/source/loaders/node_loader/trampoline/source/trampoline.cc#L112).
- The trampoline calls back into the host application, injecting the internals (https://github.com/metacall/core/blob/38abb39eed36a06584305653ff7378ae50b1b3e8/source/loaders/node_loader/trampoline/source/trampoline.cc#L134).
- After this the callback gets executed (https://github.com/metacall/core/blob/38abb39eed36a06584305653ff7378ae50b1b3e8/source/loaders/node_loader/source/node_loader_impl.cpp#L1220) and the NodeJS internals are exposed to the host application.
With this methodology it is possible to access the internals that aren't exposed by the embedding API, without touching a single line of NodeJS code. This model turns the N-API from an extension API into an embedding API, inverting the model and allowing individual functions to be called in an asynchronous manner (the event loop is also inverted).
Although it seems hacky, it has been tested on a high-performance FaaS and it works properly. It may encounter limitations with the fork model, but those are also mitigated as I explained before.
Consider the case where Node.js is built as a shared library (using clang-cl or MSVC) and used from MSVC/GCC, e.g., in Node.js + CEF. Because of the ABI incompatibility, one has to use the N-API instead of NAN/V8 to add global variables into the isolate. Since the program is not an extension but an embedder, the program needs a napi_env to access the environment of Node.js. Apparently the current N-API has no way to achieve this goal, but we could just add a function named napi_create_env, just as @empyrical said.
```cpp
napi_status napi_create_env(napi_env* env, const char* module_filename) {
  v8::Isolate* isolate = v8::Isolate::TryGetCurrent();
  if (isolate) {
    v8::Local<v8::Context> ctx = isolate->GetCurrentContext();
    if (!ctx.IsEmpty()) {
      *env = new node_napi_env__(ctx, module_filename);
      return napi_status::napi_ok;
    }
  }
  return napi_status::napi_generic_failure;
}

napi_status napi_free_env(napi_env env) {
  CHECK_ENV(env);
  delete env;
  return napi_status::napi_ok;
}
```
In this way it is the embedder's responsibility to make sure there exists an isolate in this thread. This is possible in CEF since there exists a callback with the signature CefRenderProcessHandler::OnContextCreated().
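For illustration, this is how an embedder could use those two suggested functions from such a callback - everything here is hypothetical except the existing N-API calls used once the env exists:

```cpp
#include <node_api.h>

// Called from a point where the embedder knows a V8 context is current on
// this thread (e.g. CefRenderProcessHandler::OnContextCreated). The
// napi_create_env/napi_free_env used here are the proposals above and do
// not exist in Node.js today.
void on_js_context_created() {
  napi_env env = nullptr;
  if (napi_create_env(&env, "embedder") != napi_ok)
    return;  // no isolate/context was current on this thread

  // Add a global through the ABI-stable N-API instead of NAN/V8.
  napi_value global, answer;
  napi_get_global(env, &global);
  napi_create_int32(env, 42, &answer);
  napi_set_named_property(env, global, "embedderAnswer", answer);

  napi_free_env(env);
}
```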
If those contributing to the discussion wanted to put together:
- list of key use cases
- minimal set of suggested node-api methods needed for each of those
That might be a good way to move the conversation forward.
I compel you to fully open source Node.js occur rather when every dependency requires it.
“I have to say that I'm amazed that there is code out there that loads one native addon from another native addon! Is it done by acquiring and then calling an instance of the require() function, or perhaps by using uv_dlopen() directly?”
"...more requests/cases where it[ otherwise] blocks adoption." - I say this exclusive nature of adoption and abstraction (and dependencies) has on the other hand opened node.js "internal" contributors to retribution for certain damages.
https://www.quora.com/unanswered/Will-the-market-crash-if-I-rebuild-Node-js-but-for-the-browser
Much less is the impetus to extend path
only if there are alternatives preventing node.js dominance, there are no alternatives for dependency-without-a-global-default-design-of-commonjs users - the use case is quite literally to
- use named exports of the modules that industries across- and within-greenfield-verticals, use,
- without a server farm.
I wonder if the node.js tool was to create an industry of serverless salaries instead of add utility (1/hour).
I have #43542 which has a completely independent partial implementation of this feature.
One big thing that is missing is the ability to drain the pending async callbacks and then keep using the created environment - your napi_run_env. This function is somewhat contrary to the current design principle of Node.js, where once the event loop is emptied, the process exits. Still, I think that there might be a valid use case - a C++ application loads a JS plugin into a persistent environment, then starts calling async functions now and then.
But I consider it out of scope for the moment.
Also, I see an API call for creating an environment out of a libuv event loop? Is it really needed? What is the use case?
Another thing that may be possible without the API is switching the thread that calls V8 - for those using it with fibers/green threads - but this can be added later if it is deemed necessary. What is important at the moment is that nothing is missing from napi_create_environment, because this can't be changed later.
Also, napi_create_platform can probably be called something else, napi_init_engine for example.
@empyrical @rubys @viferga @darabi @kohillyang
As part of the OSGeo GSoC 2022 program, I have implemented a full set of N-API/node-addon-api APIs for embedding Node.js in C and C++ applications that greatly reduces the boilerplate code, adds support for directly calling require and import from C/C++ and then interacting with JS entirely through the binary-stable N-API, and even for awaiting JS promises from C/C++.
The library is currently available as binaries for Ubuntu 18.04, 20.04 and 22.04 from an Ubuntu PPA for the Node.js 16.x and Node.js 18.x branches.
At the moment all other OSes require rebuilding.
This is currently to be considered very experimental, especially the Node.js 18 branch.
Can you please take a look and see if these new APIs suit your needs? They are not geared towards Electron, which has very specific needs - they are mostly for the developer who needs to quickly embed Node.js in his application to support JS plugins for his existing C/C++ application.
If we can get enough people to use it and ensure that it doesn't break anything, the PR (which is quite sizeable) will surely get merged in Node.js 19 and everyone will benefit from having a common binary stable API for embedding Node.js.
https://github.com/mmomtchev/libnode
https://launchpad.net/~mmomtchev
Wow that's really cool. Would that make it easier to embed nodejs in a larger Android/iOS application?
@CMCDragonkai It should make it easier to use from any C/C++ application - if you are willing to try building it on a mobile device, I will be glad to hear from you. Normally nothing in this PR should be platform-dependent, but at the moment only Linux (only Ubuntu, to be more precise) has been thoroughly tested and used.
From my experimentation mentioned in #43542 (comment) and the earlier suggestion that we should put together:
- list of key use cases
- minimal set of suggested node-api methods needed for each of those
The first use cases would be:
- run a script without any external dependencies
- run a script with npm installed dependencies
@mmomtchev Wouldn't you be interested in joining forces? I have solved most of the problems you have without having an embedding API. MetaCall now supports NodeJS from v10 to v18 (we used to support v8.x too, but I decided to drop that support in favor of safety, losing a bit of performance).
What @mhdawson mentioned is already supported by MetaCall too. And with respect to what @CMCDragonkai said, MetaCall has also been tested on iOS and Android, but the current build binaries are not published for those platforms yet (although it has been tested there).
Here's an old example (now there's no need for Python 2.7 anymore in order to build Node, and Debian is distributing NodeJS with libnode as a compiled library, so it does not require building NodeJS as a shared library, but the rest should work): https://github.com/metacall/embedding-nodejs-example
Another good thing about MetaCall is that I do not touch a single line of NodeJS code; it should work as-is. This was one of the main design decisions, because I do not want to maintain a port of NodeJS for embedding.
We support invoking functions and async functions, and we are implementing support for creating classes and objects from the C/C++ side too: metacall/core#343
It also has extra features that improve embedding capabilities - related to the threading model etc. - which you will eventually run into if you embed NodeJS.
@viferga Your objectives are very different from mine - in fact MetaCall should be a layer above NAPI embedding.
My objectives for NAPI embedding were:
- Be able to easily call JS code, including npm-installed modules, from C and C++ with a clean interface
- Do not require any dependencies besides the public header files - all Node.js internals are to be abstracted
- Thanks to N-API, I also got binary compatibility and C++ runtime independence, which is very nice, but this was not a requirement
- Ship a ready-to-use binary distribution for Ubuntu, compatible with the NodeSource packages
The problem with the libnode in Debian, on which the libnode in Ubuntu is also based, is that you won't get very far without accessing Node.js internals. The package itself is of course very sleek - built by the authors of the distributions - and I did borrow parts of it - but linking npm modules and working with asynchronous code will be a major problem.
@mmomtchev most of the objectives are the same... the only difference is that we offer a simpler API and a library on top of it.
We have managed to properly embed node with only the Debian libnode and N-API. It has been very costly but it works; that's what I mean. You can check our implementation if you want to know how we did it. The approach we followed to properly embed it with the current limitations of node is also explained in this post.
There has been no activity on this feature request for 5 months and it is unlikely to be implemented. It will be closed 6 months after the last non-automated comment.
For more information on how the project manages feature requests, please consult the feature request management document.
The issue is not stale and the PR is up-to-date
In the last few weeks I have made some progress on addressing this issue.
See the PR #54660 (a temporary spin off from #43542).
There are still a few TODOs listed in the PR's description, but I would like to ask participants of this discussion for early feedback. While nitpicking is welcome, I am most interested in your scenarios.
E.g., "I have scenario XYZ, how can it be addressed with the new API?", "Did you consider scenario Z?", "I use libnode for X and I really wish it could do Y", or any other thoughts or questions about the new API.
So far the design is being actively discussed with @jasongin, one of the original creators of Node-API, and many TODOs are based on his feedback. One of our core scenarios is to use the new API from node-api-dotnet. It must enable the use of libnode in .NET based server or client apps.
@mhdawson has provided valuable PR feedback on an early iteration, and we had a brief discussion of it in our latest @nodejs/node-api meeting.
The PR has a relatively long description, all new APIs are documented, and there are several unit tests that exercise the APIs.