lightningnetwork / lnd

Lightning Network Daemon ⚡️

LND crash on boot

stridentbean opened this issue · comments

Background

We have a Casa node user whose lnd instance is crashing on boot. It looks like it is happening even before lnd is unlocked, that is, as soon as the lnd container starts running.

We are going to recover the funds for this user via SCB (static channel backup). I'm assuming there is database corruption somewhere. I wanted to drop the error logs in here on the off chance that they would be helpful.

Logs:

2019-09-03T01:26:19Z lnd UNKNOWN[1340] panic: runtime error: index out of range
2019-09-03T01:26:19Z lnd UNKNOWN[1340] 
2019-09-03T01:26:19Z lnd UNKNOWN[1340] goroutine 1 [running]:
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/coreos/bbolt.(*leafPageElement).key(...)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/pkg/mod/github.com/coreos/bbolt@v1.3.2/page.go:120
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/coreos/bbolt.(*Cursor).nsearch.func2(0x6926, 0x2842960)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/pkg/mod/github.com/coreos/bbolt@v1.3.2/cursor.go:328 +0xac
2019-09-03T01:26:19Z lnd UNKNOWN[1340] sort.Search(0xd24c, 0x35cd568, 0x57e48)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/usr/local/go/src/sort/search.go:66 +0x54
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/coreos/bbolt.(*Cursor).nsearch(0x35cd6ec, 0x6204c5b0, 0x8, 0x8)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/pkg/mod/github.com/coreos/bbolt@v1.3.2/cursor.go:327 +0xb8
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/coreos/bbolt.(*Cursor).search(0x35cd6ec, 0x6204c5b0, 0x8, 0x8, 0x1d2d, 0x0)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/pkg/mod/github.com/coreos/bbolt@v1.3.2/cursor.go:257 +0x158
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/coreos/bbolt.(*Cursor).searchPage(0x35cd6ec, 0x6204c5b0, 0x8, 0x8, 0x62c7b000)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/pkg/mod/github.com/coreos/bbolt@v1.3.2/cursor.go:308 +0x118
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/coreos/bbolt.(*Cursor).search(0x35cd6ec, 0x6204c5b0, 0x8, 0x8, 0x4210, 0x0)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/pkg/mod/github.com/coreos/bbolt@v1.3.2/cursor.go:265 +0x134
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/coreos/bbolt.(*Cursor).seek(0x35cd6ec, 0x6204c5b0, 0x8, 0x8, 0x0, 0x0, 0x76f01368, 0x0, 0x1f9, 0x1f9, ...)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/pkg/mod/github.com/coreos/bbolt@v1.3.2/cursor.go:159 +0x70
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/coreos/bbolt.(*Bucket).Get(0x3005260, 0x6204c5b0, 0x8, 0x8, 0xccb3007, 0x40, 0x46a2a6c1)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/pkg/mod/github.com/coreos/bbolt@v1.3.2/bucket.go:260 +0x98
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/lightningnetwork/lnd/channeldb.fetchChanEdgeInfo(0x3005260, 0x6204c5b0, 0x8, 0x8, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/src/github.com/lightningnetwork/lnd/channeldb/graph.go:3471 +0x44
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/lightningnetwork/lnd/channeldb.(*ChannelGraph).ChannelView.func1.1(0x6204c58c, 0x24, 0x24, 0x6204c5b0, 0x8, 0x8, 0x8, 0x0)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/src/github.com/lightningnetwork/lnd/channeldb/graph.go:3038 +0xdc
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/coreos/bbolt.(*Bucket).ForEach(0x3005220, 0x35cdc28, 0xa, 0xa)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/pkg/mod/github.com/coreos/bbolt@v1.3.2/bucket.go:388 +0xf8
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/lightningnetwork/lnd/channeldb.(*ChannelGraph).ChannelView.func1(0x32d2b00, 0xcdc8d8, 0x32d2b00)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/src/github.com/lightningnetwork/lnd/channeldb/graph.go:3029 +0x11c
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/coreos/bbolt.(*DB).View(0x2a56000, 0x35cdc6c, 0x0, 0x0)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/pkg/mod/github.com/coreos/bbolt@v1.3.2/db.go:719 +0x84
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/lightningnetwork/lnd/channeldb.(*ChannelGraph).ChannelView(0x2c0aa50, 0x2954680, 0x0, 0x0, 0x0, 0x0)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/src/github.com/lightningnetwork/lnd/channeldb/graph.go:3009 +0x60
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/lightningnetwork/lnd/routing.(*ChannelRouter).Start(0x2886930, 0x0, 0x0)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/src/github.com/lightningnetwork/lnd/routing/router.go:495 +0x464
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/lightningnetwork/lnd.(*server).Start.func1()
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/src/github.com/lightningnetwork/lnd/server.go:1197 +0x418
2019-09-03T01:26:19Z lnd UNKNOWN[1340] sync.(*Once).Do(0x29ce288, 0x29b2d98)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/usr/local/go/src/sync/once.go:44 +0xb8
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/lightningnetwork/lnd.(*server).Start(0x29ce280, 0xb933e1, 0x2e)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/src/github.com/lightningnetwork/lnd/server.go:1125 +0x64
2019-09-03T01:26:19Z lnd UNKNOWN[1340] github.com/lightningnetwork/lnd.Main(0x0, 0x0)
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/src/github.com/lightningnetwork/lnd/lnd.go:473 +0xe34
2019-09-03T01:26:19Z lnd UNKNOWN[1340] main.main()
2019-09-03T01:26:19Z lnd UNKNOWN[1340] #011/go/src/github.com/lightningnetwork/lnd/cmd/lnd/main.go:14 +0x14

Your environment

  • lnd 0.7.0
  • Linux casa-node 4.14.70-v7+ #2 SMP Wed Sep 19 07:49:26 UTC 2018 armv7l GNU/Linux
  • bitcoind 0.18.0
  • Running in Docker containers with docker-compose.

@stridentbean was this immediately after updating to 0.7.1? If so, do you know what version it upgraded from? If you have an SCB, then it would seem the user was at least on 0.6?

It does look like db corruption though, so SCB will probably be the best path forward. Will continue to comb through and see if we can glean anything.
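One detail in the trace can be checked mechanically. The sketch below assumes that the first hex word in each stack frame is that function's first argument, so `sort.Search(0xd24c, ...)` was searching 0xd24c elements and `nsearch.func2(0x6926, ...)` was called with index 0x6926. Under that assumption, the index that panicked is exactly the first midpoint a binary search would probe, which would mean the very first key read on that page failed. `firstProbe` is a hypothetical helper written for this illustration; it is not part of lnd or bbolt.

```go
package main

import (
	"fmt"
	"sort"
)

// firstProbe reports the first index that sort.Search hands to its
// predicate when searching n elements, or -1 if the predicate is never
// called. Hypothetical helper for illustration only.
func firstProbe(n int) int {
	probe := -1
	sort.Search(n, func(i int) bool {
		if probe == -1 {
			probe = i // remember only the first index probed
		}
		return true
	})
	return probe
}

func main() {
	// Values copied from the stack trace; reading them as the first
	// argument of each frame is an assumption.
	const elemCount = 0xd24c  // from the sort.Search frame
	const panicIndex = 0x6926 // from the nsearch.func2 frame

	fmt.Printf("elements=%d first probe=%d index in trace=%d\n",
		elemCount, firstProbe(elemCount), panicIndex)
	// prints: elements=53836 first probe=26918 index in trace=26918
}
```

If that reading of the frames is right, the search did not get partway through the page before failing, which fits a damaged page better than a single bad record. The check says nothing about what caused the damage.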

Whoops. This user has not updated to 0.7.1 yet; they have consistently been running 0.7.0.

Originally, bitcoind was corrupted and we had to resync the bitcoind data from the cloud. While bitcoind was corrupted, lnd was online and spinning on the "unable to query estimator" error, as seen below.

2019-08-23T09:35:25Z lnd UNKNOWN[3258] 2019-08-23 09:35:24.986 [ERR] LNWL: unable to query estimator: -28: Loading block index...
2019-08-23T09:35:52Z bitcoind UNKNOWN[30386] : Error opening block database.
2019-08-23T09:35:52Z bitcoind UNKNOWN[30386] Please restart with -reindex or -reindex-chainstate to recover.

Then we turned off bitcoind and lnd, resynced bitcoind (to 1k blocks behind head), and restarted lnd. Maybe lnd was corrupted because we started it while bitcoind was behind where it was prior to the chainstate corruption?

> Maybe lnd was corrupted because we started it while bitcoind was behind where it was prior to the chainstate corruption

This scenario wouldn't cause this kind of corruption.

Are bitcoind and lnd sharing the same disk? Could've been a disk issue given that both bitcoind and lnd were corrupted.

> Are bitcoind and lnd sharing the same disk? Could've been a disk issue given that both bitcoind and lnd were corrupted.

Yes, they are on the same spinning disk HDD. It's certainly possible.

Will go ahead and close this then. Keep in mind that the channels will be closed as part of the recovery process. Feel free to comment/reopen if you run into any issues.