wandenberg / nginx-push-stream-module

A pure stream http push technology for your Nginx setup. Comet made easy and really scalable.

Apple Silicon "ngx_slab_alloc() failed: no memory"

TheBerg opened this issue

I am working on getting our infrastructure running on Apple Silicon. When I compile nginx with nginx-push-stream-module I get the above error; when I compile without it, nginx starts fine. Any ideas?

Do you know in which part it gives you this failure?
It looks like you configured the module with a very small shared memory size for this architecture.
Check this configuration and start it again.
Based on your description, this is not a matter of compiling Nginx with or without the module. The server would also start if you compiled it with the module but did not add any of its configuration to your conf file.

Hmmm...I am not sure. Let me give some more information. This is how we are compiling it:

./configure --prefix=/usr/local/nginx --with-http_ssl_module --with-pcre --with-ipv6 --sbin-path=/usr/local/nginx/bin/nginx --with-cc-opt="-I/usr/local/include -I/opt/homebrew/Cellar/pcre/8.44/include -I/opt/homebrew/Cellar/openssl@1.1/1.1.1i/include" --with-ld-opt="-L/usr/local/lib -L/opt/homebrew/Cellar/pcre/8.44/lib -L/opt/homebrew/Cellar/openssl@1.1/1.1.1i/lib" --conf-path=/usr/local/etc/nginx/nginx.conf --pid-path=/usr/local/var/run/nginx.pid --lock-path=/usr/local/var/run/nginx.lock --http-client-body-temp-path=/usr/local/var/run/nginx/client_body_temp --http-proxy-temp-path=/usr/local/var/run/nginx/proxy_temp --http-fastcgi-temp-path=/usr/local/var/run/nginx/fastcgi_temp --http-uwsgi-temp-path=/usr/local/var/run/nginx/uwsgi_temp --http-scgi-temp-path=/usr/local/var/run/nginx/scgi_temp --http-log-path=/usr/local/var/log/nginx/access.log --error-log-path=/usr/local/var/log/nginx/error.log --add-module=/tmp/nginx-push-stream-module && make -j2 && sudo make install

This is my Nginx.conf

worker_processes 1;

events {
    worker_connections 1024;
}

http {
    include mime.types;
    default_type application/octet-stream;
    sendfile on;
    keepalive_timeout 65;
}

Running Nginx after this will result in:

nginx: [crit] ngx_slab_alloc() failed: no memory

If I remove --add-module=/tmp/nginx-push-stream-module, Nginx will boot up well.

Any suggestions for debugging this are appreciated. Thank you so much for your time.

Try to do your configuration like this

events {
    worker_connections 1024;
}

http {
    include mime.types;
    default_type application/octet-stream;
    sendfile on;
    keepalive_timeout 65;
    push_stream_shared_memory_size 32M;
}

I get the same error with that set to 32M, 128MB and 1MB (tried the gamut, 😂)

[Screenshot: Screen Shot 2021-01-06 at 1 54 31 PM]

Here is a screenshot of my system, not sure if that helps.

Not too much. If the issue is tied to the Apple M1 chip, I have no way to reproduce it; mine is an Intel.
Would it be possible for you to add some log messages to find out at which point it is crashing? (I would ask for a core dump, but I doubt I could read one from another architecture. We can still try.)

Okay, so I have tried everything, I can't get any more interesting logs or anything.

% nginx -V
nginx version: nginx/1.19.6
built by clang 12.0.0 (clang-1200.0.32.28)
built with OpenSSL 1.1.1i  8 Dec 2020
TLS SNI support enabled
configure arguments: --prefix=/usr/local/nginx --with-http_ssl_module --with-pcre --with-ipv6 --sbin-path=/usr/local/nginx/bin/nginx --with-cc-opt='-I/usr/local/include -I/opt/homebrew/Cellar/pcre/8.44/include -I/opt/homebrew/Cellar/openssl@1.1/1.1.1i/include' --with-ld-opt='-L/usr/local/lib -L/opt/homebrew/Cellar/pcre/8.44/lib -L/opt/homebrew/Cellar/openssl@1.1/1.1.1i/lib' --conf-path=/usr/local/etc/nginx/nginx.conf --pid-path=/usr/local/var/run/nginx.pid --lock-path=/usr/local/var/run/nginx.lock --http-client-body-temp-path=/usr/local/var/run/nginx/client_body_temp --http-proxy-temp-path=/usr/local/var/run/nginx/proxy_temp --http-fastcgi-temp-path=/usr/local/var/run/nginx/fastcgi_temp --http-uwsgi-temp-path=/usr/local/var/run/nginx/uwsgi_temp --http-scgi-temp-path=/usr/local/var/run/nginx/scgi_temp --http-log-path=/usr/local/var/log/nginx/access.log --error-log-path=/usr/local/var/log/nginx/error.log --with-debug --add-module=/tmp/nginx-push-stream-module

My config:

worker_processes   auto;
error_log          /usr/local/var/log/nginx/debug.log debug;

events { }

http { }

note: if I remove http { } nginx boots.

gdb is not available on Apple Silicon yet. I will keep you updated when that is available.

But still getting nothing. Have any suggestions on how I might be able to get more information for us to debug with?

Let's do a test first to identify if the issue is really on the push module or in the usage of shared memory.
Please, compile Nginx without Push module and try to start it with a proxy cache configuration like this

proxy_cache_path /tmp/cache levels=1:2 keys_zone=my_cache:100m max_size=10g inactive=60m use_temp_path=off;

If this gives you a similar error, it means we have a bigger problem.
Also, if you want, we can try a session using TeamViewer or something similar.

Well, good news, it did not give me an error with the changes above.

I am totally down to get on a Team Viewer. Wanna DM me on twitter and we can set something up? twitter.com/theberg

Did you get anywhere with this - trying to help some colleagues with the same issue 🤞🙏

Hi @jaygooby Unfortunately, I was not able to reach @TheBerg, so it was not possible to debug the issue on the Apple M1 chipset.
Do you know if your colleagues are able to work with me on this? I need a faster way to communicate so I can set up some debugging and find out where the issue is.

I've borrowed one of their M1's - you can DM me on Twitter - I've requested to follow your locked account

Although I haven't got any closer to getting this running natively on Apple silicon, if you install Rosetta 2 (https://support.apple.com/en-us/HT211861) you get the arch command, which lets you emulate the Intel architecture for the build process. nginx and this module, built using Rosetta, work.

I've been using my build-nginx project to work on this issue, and using the minimal config file that @TheBerg mentions I see the same ngx_slab_alloc() failed error, but if I build with rosetta, then nginx runs:

arch -x86_64 ~/src/build-nginx/build-nginx -n https://github.com/nginx/nginx@release-1.19.6 -m https://github.com/wandenberg/nginx-push-stream-module.git@0.5.4 -d https://ftp.exim.org/pub/pcre/pcre-8.41.tar.gz -o --with-debug

This is the config I'm testing with:

# "mkdir /tmp/nginx" then save it as 
# /tmp/nginx/nginx.conf 
worker_processes   auto;
error_log  /tmp/nginx/error.log debug;
pid        /tmp/nginx/nginx.pid;
events { }
http { }

Start nginx like this (the path comes from the build-nginx tool):

~/.build-nginx/nginx-release-1.19.6/objs/nginx -p /tmp/nginx -g "error_log /tmp/nginx/error.log;" -e /tmp/nginx/error.log -c /tmp/nginx/nginx.conf

Any updates on this issue? we are also seeing it. Thanks for looking at it!

Sorry, not yet. @wandenberg and I had a quick DM back and forth. My gut feeling is that it's something to do with how the M1 Macs have changed the way they set or measure shared memory size. Perhaps a bunch of debugging in

ngx_http_push_stream_set_shm_size_slot(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)

Thank you for the update. If there's something I or others can do to help debug, please let us know!

I had success getting this to work through some aggressive printf() debugging. 😉

I admit I do not understand exactly what this code is doing, but it seems to be thrown off by the larger page size of the ARM system (16384) vs the Intel system (4096). The larger number causes a later division operation to return 0, where it should be a 1 or a 2? Or something like that... Again, I don't know what I'm doing.
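For anyone who wants to check which page size their own machine reports, here is a minimal C sketch. It uses the POSIX call sysconf(_SC_PAGESIZE), which should give the same figure as the ngx_pagesize value in the debug output; the function names are hypothetical and are not part of nginx or the module.

```c
#include <stdio.h>
#include <unistd.h>

/* Returns the virtual memory page size reported by the OS, in bytes.
 * sysconf() returns -1 if the value cannot be determined. */
long query_page_size(void)
{
    return sysconf(_SC_PAGESIZE);
}

/* Hypothetical helper: prints the page size in a form similar to the
 * printf debugging used in this thread. */
void print_page_size(void)
{
    printf("page size = %ld bytes\n", query_page_size());
}
```

Calling print_page_size() from a small main() would be expected to show 16384 on the ARM Mac and 4096 on the Intel Mac described here, going by the numbers quoted in this comment.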

But hopefully the information below will help someone who does know what they're doing to create a proper patch. In the meantime, I present my patch and my debug logs here...

Patch

--- src/ngx_http_push_stream_module_setup.c
+++ src/ngx_http_push_stream_module_setup.c
@@ -356,7 +356,7 @@ ngx_http_push_stream_exit_worker(ngx_cycle_t *cycle)
 static ngx_int_t
 ngx_http_push_stream_preconfig(ngx_conf_t *cf)
 {
-    size_t size = ngx_align(2 * sizeof(ngx_http_push_stream_global_shm_data_t), ngx_pagesize);
+    size_t size = ngx_align(4 * sizeof(ngx_http_push_stream_global_shm_data_t), ngx_pagesize);
     ngx_shm_zone_t     *shm_zone = ngx_shared_memory_add(cf, &ngx_http_push_stream_global_shm_name, size, &ngx_http_push_stream_module);
 
     if (shm_zone == NULL) {

Printf Output

[ARM] Before the patch, "no memory" error (Darwin 21.1.0 arm64):

init ngx_pagesize = 16384 at src/os/unix/ngx_posix_init.c:54
size = 16384 at /tmp/nginx-push-stream-module/src/ngx_http_push_stream_module_setup.c:360
running ngx_slab_init(sp) at src/core/ngx_cycle.c:934
init size = 16240 at src/core/ngx_slab.c:113
size = 15624 at src/core/ngx_slab.c:136
ngx_pagesize = 16384 at src/core/ngx_slab.c:137
sizeof(ngx_slab_page_t) = 24 at src/core/ngx_slab.c:138
pages = 0 at src/core/ngx_slab.c:139
15624 / (16384 + 24) at src/core/ngx_slab.c:140
checking page 0x1034582f8 at src/core/ngx_slab.c:689
page->next = 0x103458028 at src/core/ngx_slab.c:690
&pool->free = 0x103458028 at src/core/ngx_slab.c:691
page->slab = 0 at src/core/ngx_slab.c:692
pages = 1 at src/core/ngx_slab.c:693
nginx: [crit] ngx_slab_alloc() failed: no memory

[x64] Working Intel-based system, no patch needed (Darwin 21.1.0 x86_64):

init ngx_pagesize = 4096 at src/os/unix/ngx_posix_init.c:54
size = 12288 at /tmp/nginx-push-stream-module/src/ngx_http_push_stream_module_setup.c:360
running ngx_slab_init(sp) at src/core/ngx_cycle.c:934
init size = 12144 at src/core/ngx_slab.c:113
size = 11640 at src/core/ngx_slab.c:136
ngx_pagesize = 4096 at src/core/ngx_slab.c:137
sizeof(ngx_slab_page_t) = 24 at src/core/ngx_slab.c:138
pages = 2 at src/core/ngx_slab.c:139
11640 / (4096 + 24) at src/core/ngx_slab.c:140
checking page 0xbbb288 at src/core/ngx_slab.c:689
page->next = 0xbbb028 at src/core/ngx_slab.c:690
&pool->free = 0xbbb028 at src/core/ngx_slab.c:691
page->slab = 2 at src/core/ngx_slab.c:692
pages = 2 at src/core/ngx_slab.c:693
running ngx_slab_init(sp) at src/core/ngx_cycle.c:934
init size = 33554288 at src/core/ngx_slab.c:113
size = 33553784 at src/core/ngx_slab.c:136
ngx_pagesize = 4096 at src/core/ngx_slab.c:137
sizeof(ngx_slab_page_t) = 24 at src/core/ngx_slab.c:138
pages = 8144 at src/core/ngx_slab.c:139
33553784 / (4096 + 24) at src/core/ngx_slab.c:140
checking page 0x11e3288 at src/core/ngx_slab.c:689
page->next = 0x11e3028 at src/core/ngx_slab.c:690
&pool->free = 0x11e3028 at src/core/ngx_slab.c:691
page->slab = 8144 at src/core/ngx_slab.c:692
pages = 15 at src/core/ngx_slab.c:693
checking page 0x11e33f0 at src/core/ngx_slab.c:689
page->next = 0x11e3028 at src/core/ngx_slab.c:690
&pool->free = 0x11e3028 at src/core/ngx_slab.c:691
page->slab = 8129 at src/core/ngx_slab.c:692
pages = 1 at src/core/ngx_slab.c:693

[ARM] After the patch, works! (Darwin 21.1.0 arm64):

init ngx_pagesize = 16384 at src/os/unix/ngx_posix_init.c:54
size = 32768 at /tmp/nginx-push-stream-module/src/ngx_http_push_stream_module_setup.c:360
running ngx_slab_init(sp) at src/core/ngx_cycle.c:934
init size = 32624 at src/core/ngx_slab.c:113
size = 32008 at src/core/ngx_slab.c:136
ngx_pagesize = 16384 at src/core/ngx_slab.c:137
sizeof(ngx_slab_page_t) = 24 at src/core/ngx_slab.c:138
pages = 1 at src/core/ngx_slab.c:139
32008 / (16384 + 24) at src/core/ngx_slab.c:140
checking page 0x102bb42f8 at src/core/ngx_slab.c:689
page->next = 0x102bb4028 at src/core/ngx_slab.c:690
&pool->free = 0x102bb4028 at src/core/ngx_slab.c:691
page->slab = 1 at src/core/ngx_slab.c:692
pages = 1 at src/core/ngx_slab.c:693

[x64] After the patch, still works (Darwin 21.1.0 x86_64)

init ngx_pagesize = 4096 at src/os/unix/ngx_posix_init.c:54
size = 20480 at /tmp/nginx-push-stream-module/src/ngx_http_push_stream_module_setup.c:360
running ngx_slab_init(sp) at src/core/ngx_cycle.c:934
init size = 20336 at src/core/ngx_slab.c:113
size = 19832 at src/core/ngx_slab.c:136
ngx_pagesize = 4096 at src/core/ngx_slab.c:137
sizeof(ngx_slab_page_t) = 24 at src/core/ngx_slab.c:138
pages = 4 at src/core/ngx_slab.c:139
19832 / (4096 + 24) at src/core/ngx_slab.c:140
checking page 0x100db288 at src/core/ngx_slab.c:689
page->next = 0x100db028 at src/core/ngx_slab.c:690
&pool->free = 0x100db028 at src/core/ngx_slab.c:691
page->slab = 4 at src/core/ngx_slab.c:692
pages = 2 at src/core/ngx_slab.c:693
running ngx_slab_init(sp) at src/core/ngx_cycle.c:934
init size = 33554288 at src/core/ngx_slab.c:113
size = 33553784 at src/core/ngx_slab.c:136
ngx_pagesize = 4096 at src/core/ngx_slab.c:137
sizeof(ngx_slab_page_t) = 24 at src/core/ngx_slab.c:138
pages = 8144 at src/core/ngx_slab.c:139
33553784 / (4096 + 24) at src/core/ngx_slab.c:140
checking page 0x10703288 at src/core/ngx_slab.c:689
page->next = 0x10703028 at src/core/ngx_slab.c:690
&pool->free = 0x10703028 at src/core/ngx_slab.c:691
page->slab = 8144 at src/core/ngx_slab.c:692
pages = 15 at src/core/ngx_slab.c:693
checking page 0x107033f0 at src/core/ngx_slab.c:689
page->next = 0x10703028 at src/core/ngx_slab.c:690
&pool->free = 0x10703028 at src/core/ngx_slab.c:691
page->slab = 8129 at src/core/ngx_slab.c:692
pages = 1 at src/core/ngx_slab.c:693
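The four runs above appear to follow one pattern: the zone loses a fixed header, then a per-slot bookkeeping amount that grows with the page size, and whatever is left is divided by (ngx_pagesize + 24). The C sketch below reproduces that arithmetic. The constants are read off the logs (144 bytes between the zone size and "init size"; 504 and 616 bytes before the second "size" line, which is 56 bytes for each of 9 and 11 slots), not taken from the nginx headers, and the smallest-slot shift of 3 is an assumption that happens to fit those figures. All names are hypothetical.

```c
#include <stddef.h>

/* Overheads observed in the debug logs of this thread (64-bit builds).
 * They are assumptions for illustration, not nginx API values. */
#define POOL_HEADER_BYTES  144u  /* zone size minus logged "init size" */
#define PER_SLOT_BYTES      56u  /* 504 / 9 on x64, 616 / 11 on ARM */
#define SLAB_PAGE_BYTES     24u  /* logged sizeof(ngx_slab_page_t) */
#define MIN_SHIFT            3u  /* assumed: smallest slot is 8 bytes */

/* log2 of a power-of-two page size, e.g. 4096 -> 12, 16384 -> 14 */
static unsigned page_shift(size_t pagesize)
{
    unsigned shift = 0;
    while ((pagesize >> shift) > 1) {
        shift++;
    }
    return shift;
}

/* Bytes left for pages once the header and per-slot bookkeeping are
 * subtracted (the second "size =" line in each log).
 * Assumes zone_size is larger than the overhead. */
size_t usable_bytes(size_t zone_size, size_t pagesize)
{
    size_t slots = page_shift(pagesize) - MIN_SHIFT;
    return zone_size - POOL_HEADER_BYTES - slots * PER_SLOT_BYTES;
}

/* Number of whole pages that fit; integer division drops the remainder. */
size_t slab_pages(size_t zone_size, size_t pagesize)
{
    return usable_bytes(zone_size, pagesize) / (pagesize + SLAB_PAGE_BYTES);
}
```

With the logged zone sizes, slab_pages(16384, 16384) gives 0 (the failing ARM run), slab_pages(32768, 16384) gives 1, slab_pages(12288, 4096) gives 2 and slab_pages(20480, 4096) gives 4, matching the "pages =" lines at ngx_slab.c:139 above.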

@seven1m Thanks for taking the time to investigate. It was hard for me to investigate without the hardware.
I will review your logs, double-check the solution, and release a new version today.

@seven1m do you mind repeating your tests with the following patch?
I am checking what the best approach is to make it work in a cross-platform way going forward.

diff --git a/src/ngx_http_push_stream_module_setup.c b/src/ngx_http_push_stream_module_setup.c
index b502cea..f5f4429 100644
--- a/src/ngx_http_push_stream_module_setup.c
+++ b/src/ngx_http_push_stream_module_setup.c
@@ -356,7 +356,7 @@ ngx_http_push_stream_exit_worker(ngx_cycle_t *cycle)
 static ngx_int_t
 ngx_http_push_stream_preconfig(ngx_conf_t *cf)
 {
-    size_t size = ngx_align(2 * sizeof(ngx_http_push_stream_global_shm_data_t), ngx_pagesize);
+    size_t size = ngx_align(ngx_max(2 * sizeof(ngx_http_push_stream_global_shm_data_t), ngx_pagesize), ngx_pagesize);
     ngx_shm_zone_t     *shm_zone = ngx_shared_memory_add(cf, &ngx_http_push_stream_global_shm_name, size, &ngx_http_push_stream_module);
 
     if (shm_zone == NULL) {

@wandenberg I'm sorry, but I still get the same nginx: [crit] ngx_slab_alloc() failed: no memory error with that patch.

My debug output looks like this:

init ngx_pagesize = 16384 at src/os/unix/ngx_posix_init.c:54
size = 16384 at /tmp/nginx-push-stream-module/src/ngx_http_push_stream_module_setup.c:360
running ngx_slab_init(sp) at src/core/ngx_cycle.c:934
init size = 16240 at src/core/ngx_slab.c:113
size = 15624 at src/core/ngx_slab.c:136
ngx_pagesize = 16384 at src/core/ngx_slab.c:137
sizeof(ngx_slab_page_t) = 24 at src/core/ngx_slab.c:138
pages = 0 at src/core/ngx_slab.c:139
15624 / (16384 + 24) at src/core/ngx_slab.c:140
checking page 0x100a1c2f8 at src/core/ngx_slab.c:689
page->next = 0x100a1c028 at src/core/ngx_slab.c:690
&pool->free = 0x100a1c028 at src/core/ngx_slab.c:691
page->slab = 0 at src/core/ngx_slab.c:692
pages = 1 at src/core/ngx_slab.c:693
nginx: [crit] ngx_slab_alloc() failed: no memory

I believe the main culprit is the 15624 / (16384 + 24) at src/core/ngx_slab.c:140 line above, which corresponds to line 134 in ngx_slab.c, which you can read here: https://github.com/nginx/nginx/blob/3334585539168947650a37d74dd32973ab451d70/src/core/ngx_slab.c#L134

pages = (ngx_uint_t) (size / (ngx_pagesize + sizeof(ngx_slab_page_t)));

Substituting the values from the debug output, that line becomes:

pages = (ngx_uint_t) (15624 / (16384 + 24));

...which mathematically is 0.95221843, but both operands are integers, so the division truncates and pages ends up as just 0.
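The truncation can be seen in isolation with a small C sketch. It mirrors the shape of the quoted expression but uses plain size_t and hypothetical function names instead of the nginx types.

```c
#include <stddef.h>

/* Integer division, as in the quoted line: the fractional part is dropped. */
size_t pages_truncated(size_t size, size_t pagesize, size_t page_struct_size)
{
    return size / (pagesize + page_struct_size);
}

/* The same ratio in floating point, only to show what gets thrown away. */
double pages_exact(size_t size, size_t pagesize, size_t page_struct_size)
{
    return (double) size / (double) (pagesize + page_struct_size);
}
```

With the values from the debug output, pages_truncated(15624, 16384, 24) is 0 while pages_exact(15624, 16384, 24) is roughly 0.952. With the size of 32008 from the working ARM run, the truncated result is 1.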

I tweaked your patch a bit and came up with this:

diff --git a/src/ngx_http_push_stream_module_setup.c b/src/ngx_http_push_stream_module_setup.c
index b502cea..f5f4429 100644
--- a/src/ngx_http_push_stream_module_setup.c
+++ b/src/ngx_http_push_stream_module_setup.c
@@ -356,7 +356,7 @@ ngx_http_push_stream_exit_worker(ngx_cycle_t *cycle)
 static ngx_int_t
 ngx_http_push_stream_preconfig(ngx_conf_t *cf)
 {
-    size_t size = ngx_align(2 * sizeof(ngx_http_push_stream_global_shm_data_t), ngx_pagesize);
+    size_t size = ngx_align(ngx_max(2 * sizeof(ngx_http_push_stream_global_shm_data_t), 2 * ngx_pagesize), ngx_pagesize);
     ngx_shm_zone_t     *shm_zone = ngx_shared_memory_add(cf, &ngx_http_push_stream_global_shm_name, size, &ngx_http_push_stream_module);
 
     if (shm_zone == NULL) {

On ARM, this produces a size of 32768 (2x the page size), and on Intel it produces a size of 12288 (3x the page size). In both cases, the value is large enough that dividing it by (ngx_pagesize + 24) produces something greater than 0.
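To compare the three size formulas tried so far side by side, here is a minimal C sketch. The thread does not state sizeof(ngx_http_push_stream_global_shm_data_t), so SHM_DATA_BYTES is a hypothetical stand-in of 5000 bytes, chosen only because it is consistent with the logged zone sizes. align_up follows the usual round-up-to-power-of-two idiom rather than being copied from nginx, and every name is made up for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for sizeof(ngx_http_push_stream_global_shm_data_t);
 * the real value is not given in this thread. */
#define SHM_DATA_BYTES ((size_t) 5000)

/* Round d up to a multiple of a; a must be a power of two. */
static size_t align_up(size_t d, size_t a)
{
    return (d + (a - 1)) & ~(a - 1);
}

static size_t max_of(size_t a, size_t b)
{
    return a > b ? a : b;
}

/* Original code: 2 * sizeof, rounded up to a page boundary. */
size_t size_original(size_t pagesize)
{
    return align_up(2 * SHM_DATA_BYTES, pagesize);
}

/* First proposed patch: at least one page. */
size_t size_max_one_page(size_t pagesize)
{
    return align_up(max_of(2 * SHM_DATA_BYTES, pagesize), pagesize);
}

/* Tweaked patch: at least two pages. */
size_t size_max_two_pages(size_t pagesize)
{
    return align_up(max_of(2 * SHM_DATA_BYTES, 2 * pagesize), pagesize);
}
```

Under that assumed struct size, the first two functions both return 16384 for a 16384-byte page, which is the zone size that failed in the logs, while size_max_two_pages returns 32768. For a 4096-byte page all three return 12288.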

I tested this patch on both systems and nginx starts up without error.

If it seems good, can we have a merge and release? :D

Hi @seven1m Thanks again for the help on the analysis.
I will go with your solution, with just a small adjustment to the line. The result should be the same, but can you double-check before I merge to master? I just factored out the 2 *.

size_t size = ngx_align(2 * ngx_max(sizeof(ngx_http_push_stream_global_shm_data_t), ngx_pagesize), ngx_pagesize);
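Doubling the larger of two values is the same as taking the larger of the two doubled values, so the factored line should match the previous patch for any struct size. A small C sketch that checks this by brute force; the helpers are hypothetical stand-ins for ngx_align and ngx_max, and data_bytes stands for the struct size, which the thread does not state.

```c
#include <stddef.h>

/* Round d up to a multiple of a; a must be a power of two. */
static size_t align_up(size_t d, size_t a)
{
    return (d + (a - 1)) & ~(a - 1);
}

static size_t max_of(size_t a, size_t b)
{
    return a > b ? a : b;
}

/* Form from the previous patch: max(2 * data, 2 * page), then align. */
size_t size_max_of_doubles(size_t data_bytes, size_t pagesize)
{
    return align_up(max_of(2 * data_bytes, 2 * pagesize), pagesize);
}

/* Factored form: 2 * max(data, page), then align. */
size_t size_doubled_max(size_t data_bytes, size_t pagesize)
{
    return align_up(2 * max_of(data_bytes, pagesize), pagesize);
}

/* Returns 1 if both forms agree for every data size in [1, limit]. */
int forms_agree(size_t pagesize, size_t limit)
{
    for (size_t n = 1; n <= limit; n++) {
        if (size_max_of_doubles(n, pagesize) != size_doubled_max(n, pagesize)) {
            return 0;
        }
    }
    return 1;
}
```

forms_agree(4096, 100000) and forms_agree(16384, 100000) both return 1, covering the two page sizes discussed in this issue.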

@wandenberg this works great! I tested the patch on macOS x86_64, macOS ARM, and Ubuntu x86_64.

Nginx boots, and push stream works as advertised on all three platforms. 🎉

Just tested on Raspberry Pi 3 (Model B) ARM - also working 🎉

If everybody is happy with it, is it possible to get a merge and release of the code that works? :D thanks!

Thank you, friend!

thank you for the fix!

@wandenberg sorry to bug you, but will there be a 0.5.5 release with this fix? thanks again!