sakaki- / gentoo-on-rpi-64bit

Bootable 64-bit Gentoo image for the Raspberry Pi4B, 3B & 3B+, with Linux 5.4, OpenRC, Xfce4, VC4/V3D, camera and h/w codec support, weekly-autobuild binhost

Pi becomes unresponsive after a while.

benjaminmordaunt opened this issue · comments

I compiled a CPU miner and used it for about 4-5 hours, but then the Pi became unresponsive: I could no longer SSH in, and the mining stopped.

I'm using the official power supply, so I can't see how it would be a power issue. Would overheating cause the Pi to just lock up like that? The official Raspbian distribution seems to run just fine indefinitely.

Could you maybe suggest where I could find some logs as to what's happening?

Thanks.

Try looking in /var/log/messages when you reboot. There may be something there.
Are you running the GUI when the Pi locks up? If so, try running:

sudo /etc/init.d/xdm stop

Then log in at the text console, and start your miner from there. Do you get the same issue?
Another thing (if you don't already do it) would be to prefix the miner launch with nice -n 19, which will run it at the lowest system priority.
Also, check that you aren't running out of memory. Add another swapfile if necessary.
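
As a rough sketch of the log check suggested above, the helper below greps a syslog file for a few kernel messages that tend to show up around lock-ups (memory exhaustion, under-voltage, thermal events, hung tasks). The function name scan_log and the pattern list are illustrative assumptions, not something shipped with the image, and the exact message wording varies between kernel versions.

```shell
#!/bin/sh
# scan_log FILE: print lines from FILE that match a few kernel messages
# often seen around lock-ups. The pattern list is an illustrative
# assumption; message wording differs between kernel versions.
scan_log() {
    grep -i -E 'out of memory|oom-killer|under-voltage|thermal|blocked for more than' "$1"
}

# Example (needs read access to the log):
#   scan_log /var/log/messages | tail -n 20
```

An empty result does not rule these causes out: if the board loses power abruptly, the last messages may never reach the disk.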

Already ran
sudo /etc/init.d/xdm stop
as well as
sudo chmod -x /etc/init.d/xdm*

It doesn't seem to be a memory problem: according to top, the miner uses a consistent 0.7% of RAM throughout its execution.
Setting a lower priority seems counterintuitive for a mining application, but could definitely help.

Unfortunately, can't provide logs or anything because I've switched to pi64 (Debian) for the time being. Everything seems to be working just fine.

Thanks anyway.

Closing this, because it's likely the fault is on my end. Would be beneficial to look into it further though - could be a throttling/power save or general kernel power issue.

OK, sorry to hear you had problems with it ><

PS the reason to suggest nice was to ensure that sshd definitely doesn't get locked out; with nothing else running, the miner would still get full use of the CPU.

If you have no GUI running, not sure what could be causing the issue. I run a RPi3 build server that (this being Gentoo ^-^) spends pretty much all day every day compiling new packages, and have had few problems with it.

What mining software were you running on it btw? If I get a chance I'll set up a test system, to see if I can reproduce the issue.

I was using m-cpuminer-v2.

Hi, so I've downloaded, built and installed this software on a spare RPi3 running the image. What else do I need to run it (seems to want a password etc.)?

Hi, I'll have to set you up to use the mining pool I'm registered to (you will effectively be generating cryptocurrency FOR ME whilst mining - shouldn't be a problem but thought you should know ^.^).

Additional note: the Pi will need internet access.

Try running it with:
./m-minerd -o stratum+tcp://mining.m-hash.com:3334 -u s0blan.benmine18 -p pass18 -t 4 -e 100 &

If everything goes well, you should start mining ^.^

EDIT:
The issue seems to occur at random times (varying from 1 to 10 hours, so far), so you will effectively be mining cryptocurrency FOR ME during this time. If you don't mind this, that's fine. If you do, you will need to quickly create a pool mining account at one of the XMG mines yourself.

By the way, this issue also occurs on pi64 (Debian). I didn't think it would, but after a couple days, the Pi shut off again. Thanks for your time on this issue.

EDIT 2:
I built this executable with the GCC compiler flags: -Ofast -ffast-math -funsafe-math-optimizations
Whether this causes the problem, I don't know.

Reopening this, because this is not intended system behaviour, even if the process were to crash.

you will effectively be generating cryptocurrency FOR ME whilst mining - shouldn't be a problem but thought you should know ^.^)

Have many of these bug reports filed? ^-^

Well, running now. Built with standard flags, started under demouser with:

demouser@pi64 ~ $ nice -n 19 firejail m-minerd -o stratum+tcp://mining.m-hash.com:3334 -u s0blan.benmine18 -p pass18 -t 4 -e 100 &

xdm is stopped. I'll keep an eye on it to see what happens.

you will effectively be generating cryptocurrency FOR ME whilst mining - shouldn't be a problem but thought you should know ^.^)

Have many of these bug reports filed? ^-^

Oh no, this isn't a bug - I've set you up to mine cryptocurrency on my behalf so you can run and test the program. I wanted to warn you of this in case you were uncomfortable with that 😄

Seems to have stayed running overnight OK, but I'm continuing to monitor it. Two points:

  1. I moved genup out of /etc/cron.weekly; don't want the system trying to auto-update while mining.
  2. With the HDMI cable connected, the RPi3 is showing its 'thermometer' icon on the display, indicating that it is thermally throttled. This is on an RPi3 with a heatsink, but no active cooling. If your system did actually overheat sufficiently, it is possible this could have caused it to shut down (it can consume more power / generate more heat running in 64-bit mode than in 32-bit). A mining rig may need active cooling (e.g. a CPU fan). Also, I have an official RPi3 power supply attached. If your power supply is insufficient, this can also cause the RPi3 to intermittently switch off / reboot when under heavy load.
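
On a headless box there is no thermometer icon to look at, so one way to check for throttling or under-voltage is the firmware's throttle bitmask. The sketch below decodes the value printed by vcgencmd get_throttled. It assumes vcgencmd is installed and working on your image (not confirmed in this thread), and the helper name decode_throttled is made up for illustration. The bit meanings follow the Raspberry Pi firmware documentation.

```shell
#!/bin/sh
# decode_throttled VALUE: explain the bitmask printed by
# `vcgencmd get_throttled`, e.g. "throttled=0x50005" or just "0x50005".
# The function only does arithmetic; it does not call vcgencmd itself.
decode_throttled() {
    value=$(( ${1#throttled=} ))
    # Each line of the here-document is "<bit number> <meaning>".
    while read -r bit label; do
        if [ $(( ($value >> $bit) & 1 )) -eq 1 ]; then
            echo "$label"
        fi
    done <<EOF
0 under-voltage detected
1 ARM frequency capped
2 currently throttled
3 soft temperature limit active
16 under-voltage has occurred
17 ARM frequency capping has occurred
18 throttling has occurred
19 soft temperature limit has occurred
EOF
}

# Example (assumes vcgencmd is available):
#   decode_throttled "$(vcgencmd get_throttled)"
```

The low bits describe the current state and bits 16 to 19 record whether the condition has occurred since boot, so the sticky bits are only useful if the board is still up when you check.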

If this stays up for a while longer, I'll try rebuilding with more aggressive flags. Binaries on the image itself are built with:

CFLAGS="-march=armv8-a+crc -mtune=cortex-a53 -O2 -pipe"

Build flags on an unprivileged executable should make no difference to system stability, though.

Thanks for doing this, it will be a little embarrassing if the Pi never fails :P

I too am using the official Raspberry Pi 2.5A power supply with passive cooling (heatsink). I expected there would be thermal throttling but thought that the purpose of throttling was to keep the system alive under heavy loads.

If the Pi continues to succeed over the course of the day, I urge you to stop testing - one) because I feel bad that you're mining for me ^.^ and two) because the system may be turning off by overheating.

If the system is turning off by overheating, will the logs be located in the same place as you gave earlier in the thread?

Thanks again.

I expected there would be thermal throttling but thought that the purpose of throttling was to keep the system alive under heavy loads.

Yes, but it appears that throttling + heat sink may not be sufficient under certain conditions. See e.g. here for an example of another compute-bound task (AI) which required active cooling. You might get a message logged to /var/log/messages if thermal failure does occur, but most likely the system will simply hard shutdown (rather than HCF).
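
Since a hard shutdown may leave nothing in /var/log/messages, one option is to record the SoC temperature periodically and flush it to disk, so the last reading before a power-off survives. This is a minimal sketch: the sensor path /sys/class/thermal/thermal_zone0/temp is an assumption about the running kernel, and the function names millideg_to_c and log_temp and the 60-second default are illustrative.

```shell
#!/bin/sh
# millideg_to_c N: convert a millidegree reading (e.g. 48312) to degrees
# with one decimal place, truncated rather than rounded.
millideg_to_c() {
    printf '%d.%d\n' $(( $1 / 1000 )) $(( ($1 % 1000) / 100 ))
}

# log_temp LOGFILE [INTERVAL]: append a timestamped temperature every
# INTERVAL seconds (default 60) and call sync after each write, so the
# last reading is likely to be on disk if the board powers off.
log_temp() {
    zone=/sys/class/thermal/thermal_zone0/temp   # assumed sensor path
    while [ -r "$zone" ]; do
        echo "$(date '+%F %T') $(millideg_to_c "$(cat "$zone")")C" >> "$1"
        sync
        sleep "${2:-60}"
    done
}

# Example: run in the background alongside the miner
#   log_temp /var/log/soc-temp.log 30 &
```

Calling sync on every sample costs some SD-card writes, which seems an acceptable trade for a short diagnostic run but is probably not something to leave enabled permanently.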

FWIW, the test miner is still running... going to give it another day or so.

Still running without issue this end, so shutting down the test now. I'll try an optimized build run over the next few days.

While you do that, I will reinstall Gentoo and set up my own worker using the optimized build. I will then be able to send logs.
Thanks.

OK perfect. I'll aim to kick off the optimized run tomorrow sometime, with the same pool login settings as before.

Also, as mentioned above, and although somewhat counter-intuitive, I'd urge you to prefix the m-minerd command with nice -n 19 or similar - it being very compute intensive, you want to ensure that sshd etc. can run OK when needed. Theoretically it should be fine running at the default priority on the RPi3, but I have always found it better to run e.g. large compiles with this setting. Provided no other apps are competing meaningfully for CPU time on the box (which will normally be the case) running at the lowest priority like this will not adversely impact overall performance.
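
To make the priority point concrete, here is a small sketch. The wrapper name run_low_prio is made up for illustration; nice, renice and pidof are the standard tools.

```shell
#!/bin/sh
# run_low_prio CMD [ARGS...]: start CMD at the lowest CPU priority.
# nice adds 19 to the caller's niceness, and the result is capped at 19.
run_low_prio() {
    nice -n 19 "$@"
}

# With GNU coreutils, `nice` with no command prints its own niceness, so
# this shows what a child receives (19 when started from a shell running
# at the default niceness of 0):
#   run_low_prio nice
#
# To lower the priority of a miner that is already running (assumes pidof
# finds exactly one m-minerd process):
#   renice -n 19 -p "$(pidof m-minerd)"
```

Niceness only matters when several processes want the CPU at the same moment, which is why an otherwise idle box loses essentially no hash rate while sshd stays responsive.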

The optimized miner is now running, with the same invocation settings as mentioned earlier.
I've also added an ebuild for this to the rpi3 overlay, so the package can now be installed simply by issuing:

pi64 ~ # emaint sync --repo rpi3
pi64 ~ # emerge -av m-minerd

I'll keep an eye on it over the next few days to see what happens.

The optimized m-minerd has now been running continuously without issue on my test RPi3 for over a week (as you can probably tell from your mining pool logs):

pi64 ~ # ps -p $(pidof m-minerd) -o etime
      ELAPSED
 7-19:54:18

I'm going to stop it running now: unfortunately I'm unable to replicate the instability. I have however added some brief notes about running a headless server on the RPi3, here.

I'm not sure it makes sense to keep this issue open, as overheating looks the most likely culprit at this point, but if you have further diagnostics your end please let me know!

Best, sakaki

Thanks a lot.

Sorry, I've been busy with college work so haven't gotten around to testing again on my end. Will install the distro again at the weekend and try to extract some usable logs for you.

On the subject of running a headless server using your distro, wouldn't it be suitable to release a separate -headless.img which sets up SSH and doesn't include graphical packages by default, similar to how Raspbian does it with their LITE version?

Just a thought. Closing for the moment until I can find evidence of the problem.

On the subject of running a headless server using your distro, wouldn't it be suitable to release a separate -headless.img which sets up SSH and doesn't include graphical packages by default, similar to how Raspbian does it with their LITE version?

Yes, it'd be good to support that, and a systemd variant too. Neither is technically difficult, but unfortunately I don't have the bandwidth for further testing as part of the weekly autobuild release process atm.