RobertCNelson / bb-kernel

Add patch to resolve kernel panic when using CAN devices

psyklopz opened this issue

This affects the 3.8.13-boneX releases. The c_can.ko kernel module has a bug that was corrected in version 3.11 of the Linux kernel. The link below includes a sample of the crash and describes how to resolve it.
http://socket-can.996257.n3.nabble.com/Beaglebone-Black-System-Crash-using-SocketCAN-td7702.html

I have created and tested a patch file, and I can verify that it resolves the issue.
https://github.com/psyklopz/bb-kernel/blob/3.8.13-bone72/patches/net/0010-backport-patch-to-fix-kernel-panic-caused-by-c_can-driver.patch

I would like to submit a pull request to you, but want to work with you so that it can be done in the format that you desire. Below are additional links of users experiencing the same problem. I know this affects commercially-available CAN capes as well. Please let me know how I can be of assistance.

http://comments.gmane.org/gmane.linux.can/5151
https://groups.google.com/forum/#!topic/beagleboard/JL6aSRc0b7E

Nice find! If you want, I can just take the patch as is:

https://github.com/psyklopz/bb-kernel/blob/3.8.13-bone72/patches/net/0010-backport-patch-to-fix-kernel-panic-caused-by-c_can-driver.patch

and integrate it into the patch script. (i'll also push it out as bone72 shortly)

Regards,

oh. very interested in getting this sorted.

I think this will likely affect the rt-can RTDM driver in the Xenomai variant as well which is based on the same code.

I have it in the patch script as well, on a branch on my github. Also did the favor of bumping the version number.
https://github.com/psyklopz/bb-kernel/tree/3.8.13-bone72

I'll just cherry pick that.. ( i have a few things queued after bone71. ;) )

and pushed out:

https://github.com/RobertCNelson/bb-kernel/commits/am33x-v3.8

and queued up on the builder...

https://github.com/beagleboard/linux/tree/3.8 (well shortly, just waiting for 1000 git am's)

Regards,

I've notified Steve for the RT-CAN side.
Eventually we'll have to patch both the SocketCAN and RT-CAN drivers in the Xenomai tree.

I'll build bone72 just as you have released it, and verify that everything works (as it should!). Expect me to close this issue sometime tomorrow.

Looks like the relevant change is in the order of two checks. Previously, the check for "EoB" (End of Block) happened before the check for "Message Lost" on each message object, and if EoB was encountered you'd return from the do_rx_poll routine rather than moving on to the next message object. So maybe the old order prevented us from actually handling the "Message Lost" case.
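To make the ordering point concrete, here is a toy Python model of the two check orders. It is only a sketch of the description in this thread, not the driver code: `MsgObj`, `poll_eob_first`, and `poll_lost_first` are hypothetical names, and what the real routine does after handling a lost message is an assumption here.

```python
from dataclasses import dataclass


@dataclass
class MsgObj:
    """Hypothetical stand-in for one CAN message object's control flags."""
    eob: bool = False       # "End of Block" flag
    msg_lost: bool = False  # "Message Lost" flag


def poll_eob_first(objs):
    """Old ordering as described in the thread: EoB is checked first."""
    handled_lost = 0
    for obj in objs:
        if obj.eob:
            # Early return: Message Lost on this object is never examined.
            return handled_lost
        if obj.msg_lost:
            handled_lost += 1
    return handled_lost


def poll_lost_first(objs):
    """Swapped ordering: Message Lost is handled before the EoB check."""
    handled_lost = 0
    for obj in objs:
        if obj.msg_lost:
            handled_lost += 1
        # Assumption of this sketch: EoB is still honoured afterwards.
        if obj.eob:
            return handled_lost
    return handled_lost


# The last object in the block carries both flags at once.
objs = [MsgObj(), MsgObj(eob=True, msg_lost=True)]
print(poll_eob_first(objs))   # 0: the lost message is skipped
print(poll_lost_first(objs))  # 1: the lost message is handled
```

In this model the two orders differ only when a single message object has both flags set, which matches the suspicion that the old order could skip the "Message Lost" handling.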

I'm also not 100% sure if this really affects the RT version because of possible differences between RT Socket CAN and regular Socket CAN.

Yes, the merged-in code does switch the order of the EoB check and the Message Lost check, which was another change made for the 3.11 kernel. However, it also fixes an issue where pm_runtime_get_sync() was called from interrupt context. This is apparently not allowed and triggers the kernel panic.

10/31/13 Commit, fixes RX handling
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/net/can/c_can/c_can.c?id=5d0f801a2ccec3b1fdabc3392c8d99ed0413d216

11/25/13 Commit, fixes kernel panic
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/net/can/c_can/c_can.c?id=e35d46adc49b469fd92bdb64fea8af93640e6651

Thanks. In my RTDM port, the call to pm_runtime_get_sync() that got removed is already absent because of differences between Socket CAN and RT Socket CAN, so I hadn't mentioned that part in my comment.

Steve
(Sent in reply to Eric Wright's comment of Jun 15, 2015, quoted above.)