Segmentation fault always occurs when using the interlink app
wangchencheng93 opened this issue · comments
In the receiver script, I only read data from interlink-receiver and send it to a sink app.
In the transmitter script, I get data from the Intel app and send it to the Transmitter.
Why does the segmentation fault always occur?
Does sudo apps/interlink/selftest.snabb work for you? It should be essentially equivalent except that it uses a Source app instead of a NIC driver.
Thank you for your reply.
Yes, I remember running apps/interlink/selftest.snabb, and it worked. But now it does not: the segmentation fault has occurred again.
So I tried this on another server with the same OS version, and it worked there.
Interestingly, if I use a pcap.PcapReader app instead of a Source app, it sometimes works and sometimes segfaults. Why does this happen?
I am eager to solve this problem.
Could you include an example that triggers the bug for you? I can’t reproduce the segfault using, e.g.:
diff --git a/src/apps/interlink/test_source.lua b/src/apps/interlink/test_source.lua
index cfa71b741..fdb8bbf7b 100644
--- a/src/apps/interlink/test_source.lua
+++ b/src/apps/interlink/test_source.lua
@@ -3,13 +3,16 @@
 module(...,package.seeall)
 
 local Transmitter = require("apps.interlink.transmitter")
-local Source = require("apps.basic.basic_apps").Source
+local PcapReader = require("apps.pcap.pcap").PcapReader
+local Repeater = require("apps.basic.basic_apps").Repeater
 
 function start (name)
    local c = config.new()
    config.app(c, name, Transmitter)
-   config.app(c, "source", Source)
-   config.link(c, "source.output -> "..name..".input")
+   config.app(c, "source", PcapReader, "apps/packet_filter/samples/v4.pcap")
+   config.app(c, "repeat", Repeater)
+   config.link(c, "source.output -> repeat.input")
+   config.link(c, "repeat.output -> "..name..".input")
    engine.configure(c)
    engine.main()
 end
Also, Snabb should log the segfault like so:
snabb[7551]: segfault at 0x30 ip 0x5579d291e7fc sp 0x7ffc09f5f6c0 code 1 errno 0
This information would be valuable to have.
If you have a clean copy of the master branch and apps/interlink/selftest.snabb fails, I would be interested in the output of sudo strace -f apps/interlink/selftest.snabb and the exact OS and kernel version.
Sorry, I made a mistake. I thought it worked, but in fact it didn't. It only appeared to work because I use an rss app before the interlink.transmitter app, and sometimes the packets were not distributed to interlink.transmitter.
Both servers run CentOS Linux release 7.3.1611 (Core). The kernel version of the one that runs apps/interlink/selftest.snabb normally is 4.14.80-1.el7.centos.x86_64, and the other one is 3.10.0-514.26.2.el7.x86_64.
I ran apps/interlink/selftest.snabb and Snabb logged the segfault like this:
snabb[64926]: segfault at 0x50015983e100 ip 0x4465b6 sp 0x7fffa3e37a90 code 1 errno 0
The output of strace -f apps/interlink/selftest.snabb is as follows:
out.log
[pid 65058] mmap(0x500159800000, 2097152, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED|0x40000, 4, 0) = -1 EINVAL (Invalid argument)
[pid 65060] mprotect(0x2a4f0000, 65536, PROT_READ|PROT_EXEC <unfinished ...>
[pid 65058] write(2, "snabb[65058]: segfault at 0x5001"..., 86snabb[65058]: segfault at 0x50015983e100 ip 0x4465b6 sp 0x7ffe718d9dd0 code 1 errno 0
This seems to be the relevant part of the strace output. mmap fails with EINVAL, and thus our SIGSEGV handler punts.
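To connect the numbers in the strace excerpt, here is a small sketch. It assumes 2 MiB huge pages (the 2097152-byte length in the mmap call) and x86-64 Linux, where MAP_HUGETLB has the value 0x40000 that strace printed in raw form. The helper names are hypothetical.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/mman.h>

// 2 MiB, the length passed to mmap in the strace output (assumed huge page size).
#define HUGE_PAGE_SIZE (2ULL * 1024 * 1024)

// Round an address down to the start of the 2 MiB page that contains it.
static uint64_t huge_page_base(uint64_t address) {
  return address & ~(HUGE_PAGE_SIZE - 1);
}

// Return nonzero if a raw mmap flags word includes MAP_HUGETLB.
static int has_hugetlb_flag(int flags) {
  return (flags & MAP_HUGETLB) != 0;
}
```

With these helpers, huge_page_base(0x50015983e100) evaluates to 0x500159800000, so the faulting address lies inside the region that the failed mmap call tried to map. Likewise, has_hugetlb_flag(MAP_SHARED|MAP_FIXED|0x40000) is nonzero on x86-64, which identifies the unnamed flag in the strace line as MAP_HUGETLB.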
I think this is the same problem as in #1210, which led to a fix in core/memory.lua (#1373). Seems like we forgot to apply the fix to the SIGSEGV handler that handles inter-process memory sharing as well.
Could you try running with this patch and see if that fixes it?
diff --git a/src/core/memory.c b/src/core/memory.c
index 190873285..03a43d898 100644
--- a/src/core/memory.c
+++ b/src/core/memory.c
@@ -73,7 +73,7 @@ static void memory_sigsegv_handler(int sig, siginfo_t *si, void *uc)
   }
   // Map the memory at the expected address
   if (mmap((void *)(page | TAG), st.st_size, PROT_READ|PROT_WRITE,
-           MAP_SHARED|MAP_FIXED|MAP_HUGETLB, fd, 0) == MAP_FAILED) {
+           MAP_SHARED|MAP_FIXED, fd, 0) == MAP_FAILED) {
     goto punt;
   }
   close(fd);
This problem has been solved.
I truly appreciate your timely help.
My pleasure!