snabbco / snabb

Snabb: Simple and fast packet networking

Segmentation fault always occurs when using the interlink app

wangchencheng93 opened this issue

In the receiver script, I only read data from interlink-receiver and send it to the sink app.
In the transmitter script, I get data from the Intel app and send it to the Transmitter.
Why does a segmentation fault always occur?

Does sudo apps/interlink/selftest.snabb work for you? It should be essentially equivalent except that it uses a Source app instead of a NIC driver.

Thank you for your reply.
Yes, I remember that I ran apps/interlink/selftest.snabb and it worked. But now it does not: the segmentation fault has occurred again.
So I tried this on another server with the same OS version, and it worked.
Interestingly, if I use a pcap.PcapReader app instead of a Source app, it sometimes works and sometimes segfaults. Why does this happen?
I am eager to solve this problem.

Could you include an example that triggers the bug for you? I can’t reproduce the segfault using, e.g.:

diff --git a/src/apps/interlink/test_source.lua b/src/apps/interlink/test_source.lua
index cfa71b741..fdb8bbf7b 100644
--- a/src/apps/interlink/test_source.lua
+++ b/src/apps/interlink/test_source.lua
@@ -3,13 +3,16 @@
 module(...,package.seeall)
 
 local Transmitter = require("apps.interlink.transmitter")
-local Source = require("apps.basic.basic_apps").Source
+local PcapReader = require("apps.pcap.pcap").PcapReader
+local Repeater = require("apps.basic.basic_apps").Repeater
 
 function start (name)
    local c = config.new()
    config.app(c, name, Transmitter)
-   config.app(c, "source", Source)
-   config.link(c, "source.output -> "..name..".input")
+   config.app(c, "source", PcapReader, "apps/packet_filter/samples/v4.pcap")
+   config.app(c, "repeat", Repeater)
+   config.link(c, "source.output -> repeat.input")
+   config.link(c, "repeat.output -> "..name..".input")
    engine.configure(c)
    engine.main()
 end

Also, Snabb should log the segfault like so:

snabb[7551]: segfault at 0x30 ip 0x5579d291e7fc sp 0x7ffc09f5f6c0 code 1 errno 0

This information would be valuable to have.

If you have a clean copy of the master branch and apps/interlink/selftest.snabb fails, I would be interested in the output of sudo strace -f apps/interlink/selftest.snabb and exact OS and kernel version.

Sorry, I made a mistake. I thought it worked, but in fact it didn't: I use an rss app before the interlink.transmitter app, and sometimes the packets were not distributed to interlink.transmitter.

Both servers run CentOS Linux release 7.3.1611 (Core). The kernel version of the one that runs apps/interlink/selftest.snabb normally is 4.14.80-1.el7.centos.x86_64, and that of the other one is 3.10.0-514.26.2.el7.x86_64.
I ran apps/interlink/selftest.snabb and Snabb logged the segfault like this:

snabb[64926]: segfault at 0x50015983e100 ip 0x4465b6 sp 0x7fffa3e37a90 code 1 errno 0

The output of strace -f apps/interlink/selftest.snabb is as follows:
out.log

[pid 65058] mmap(0x500159800000, 2097152, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED|0x40000, 4, 0) = -1 EINVAL (Invalid argument)
[pid 65060] mprotect(0x2a4f0000, 65536, PROT_READ|PROT_EXEC <unfinished ...>
[pid 65058] write(2, "snabb[65058]: segfault at 0x5001"..., 86snabb[65058]: segfault at 0x50015983e100 ip 0x4465b6 sp 0x7ffe718d9dd0 code 1 errno 0

This seems to be the relevant part of the strace output. mmap fails with EINVAL, and thus our SIGSEGV handler punts.

I think this is the same problem as in #1210, which led to a fix in core/memory.lua (#1373). Seems like we forgot to apply the fix to the SIGSEGV handler that handles inter-process memory sharing as well.

Could you try running with this patch and see if that fixes it?

diff --git a/src/core/memory.c b/src/core/memory.c
index 190873285..03a43d898 100644
--- a/src/core/memory.c
+++ b/src/core/memory.c
@@ -73,7 +73,7 @@ static void memory_sigsegv_handler(int sig, siginfo_t *si, void *uc)
   }
   // Map the memory at the expected address
   if (mmap((void *)(page | TAG), st.st_size, PROT_READ|PROT_WRITE,
-           MAP_SHARED|MAP_FIXED|MAP_HUGETLB, fd, 0) == MAP_FAILED) {
+           MAP_SHARED|MAP_FIXED, fd, 0) == MAP_FAILED) {
     goto punt;
   }
   close(fd);

This problem has been solved.
I truly appreciate your timely help.

My pleasure!