HDT3213 / rdb

A Redis RDB parser implemented in Go, for secondary development and memory analysis

Home Page: https://www.cnblogs.com/Finley/p/16251360.html

Possible to try and skip malformed entries?

ptaoussanis opened this issue

Hi, and thank you so much for this wonderful tool!

I have a question-

I have an RDB dump file that has at least one entry that seems to be invalid.
redis-check-rdb dump.rdb shows an error:

[offset 0] Checking RDB file dump.rdb
[offset 26] AUX FIELD redis-ver = '5.0.8'
[offset 40] AUX FIELD redis-bits = '64'
[offset 52] AUX FIELD ctime = '1681088401'
[offset 67] AUX FIELD used-mem = '1359262528'
[offset 83] AUX FIELD aof-preamble = '0'
[offset 85] Selecting DB ID 0
--- RDB ERROR DETECTED ---
[offset 51432] Internal error in RDB reading offset 0, function at rdb.c:2080 -> Ziplist integrity check failed.
[additional info] While doing: read-object-value
[additional info] Reading key 'badkey-x'
[additional info] Reading type 14 (quicklist)
[info] 87 keys read
[info] 1 expires
[info] 0 already expired
46161:C 10 Apr 2023 11:42:54.008 # Terminating server after rdb file reading failure.

I'm hoping to try to use or modify your tool to see if it might be possible to skip this badkey-x, and try to save as much of the rest of the data in the rdb file as possible.

When I try ./rdb -c aof -o dump.aof dump.rdb I get the following error:
error: panic: runtime error: slice bounds out of range [:10] with capacity 0

I'm currently digging into the source code to try to better understand how parsing works and whether it might be possible to skip over a broken entry.

If you have any advice on whether this is possible, or how to do it, I'd really appreciate your guidance.

Thank you again!

commented

First of all, thank you for your support.

Based on the information I have obtained so far, it seems that a ziplist in the quicklist is broken. You can confirm this by checking the stacktrace when the panic occurred.

If that's the case, we are lucky, because each ziplist in the RDB file is wrapped in a string (see readZipList). You could modify readZipList to ignore errors and panics encountered after dec.readString(), so that only the corrupted ziplist is discarded and the other data is preserved.
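As a rough illustration of the "ignore panics after the string has been read" idea, here is a minimal, self-contained Go sketch. It is not the rdb package's code: parseHeader and safeEntryCount are hypothetical names, and the payload is assumed to already be in memory, as it would be after a readString-style call. The only format detail relied on is the 10-byte ziplist header (4-byte zlbytes, 4-byte zltail, 2-byte zllen).

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// parseHeader is a hypothetical stand-in for a ziplist parser.
// It reads the 2-byte entry count at offset 8 of the 10-byte header and,
// like unguarded slicing in general, panics if buf is too short.
func parseHeader(buf []byte) int {
	return int(binary.LittleEndian.Uint16(buf[8:10]))
}

// safeEntryCount wraps parseHeader and converts a panic into an error,
// so a caller can drop one corrupted payload and keep going.
func safeEntryCount(buf []byte) (n int, err error) {
	defer func() {
		if r := recover(); r != nil {
			n, err = 0, fmt.Errorf("skipping corrupted ziplist: %v", r)
		}
	}()
	return parseHeader(buf), nil
}

func main() {
	payloads := [][]byte{
		{0, 0, 0, 0, 0, 0, 0, 0, 3, 0}, // well-formed header, 3 entries
		{},                             // empty payload, as in a damaged dump
	}
	for i, p := range payloads {
		n, err := safeEntryCount(p)
		if err != nil {
			fmt.Printf("payload %d: %v\n", i, err)
			continue
		}
		fmt.Printf("payload %d: %d entries\n", i, n)
	}
}
```

Because the whole payload has already been consumed from the stream before parsing starts, abandoning it should leave the reader positioned at the next object, which is why only the damaged ziplist is lost.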

Anyway, please check the stacktrace first to confirm if readZipList panicked.

@HDT3213 Hi finley, thank you so much for the quick and helpful reply.

It does indeed look like readZipList is throwing:

goroutine 1 [running]:
runtime/debug.Stack()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/debug/stack.go:24 +0x64
runtime/debug.PrintStack()
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/debug/stack.go:16 +0x1c
github.com/hdt3213/rdb/core.(*Decoder).Parse.func1()
	/Users/ptaoussanis/Repos/misc/rdb/core/decoder.go:391 +0x3c
panic({0x1024cc7c0, 0x140001f35c0})
	/opt/homebrew/Cellar/go/1.20.3/libexec/src/runtime/panic.go:884 +0x204
github.com/hdt3213/rdb/core.readZipListLength(...)
	/Users/ptaoussanis/Repos/misc/rdb/core/utils.go:40
github.com/hdt3213/rdb/core.(*Decoder).readZipList(0x14000075110?)
	/Users/ptaoussanis/Repos/misc/rdb/core/ziplist.go:17 +0x1b4
github.com/hdt3213/rdb/core.(*Decoder).readQuickList(0x102496be0?)
	/Users/ptaoussanis/Repos/misc/rdb/core/list.go:49 +0xcc
github.com/hdt3213/rdb/core.(*Decoder).readObject(0x14000075110?, 0xe, 0x1400054cb40)
	/Users/ptaoussanis/Repos/misc/rdb/core/decoder.go:182 +0x8cc
github.com/hdt3213/rdb/core.(*Decoder).parse(0x14000075110, 0x1400000c198)
	/Users/ptaoussanis/Repos/misc/rdb/core/decoder.go:372 +0x280
github.com/hdt3213/rdb/core.(*Decoder).Parse(0x1024ec9b8?, 0x14000075110?)
	/Users/ptaoussanis/Repos/misc/rdb/core/decoder.go:399 +0x94
github.com/hdt3213/rdb/helper.ToAOF({0x16dd73a91, 0x8}, {0x16dd73a88, 0x8}, {0x14000115f38, 0x1, 0x1})
	/Users/ptaoussanis/Repos/misc/rdb/helper/converter.go:105 +0x450
main.main()
	/Users/ptaoussanis/Repos/misc/rdb/cmd.go:91 +0x580
error: panic: runtime error: slice bounds out of range [:10] with capacity 0

I'll try modifying the readZipList function now and report back 👍

commented

Hopefully it's just a few flipped bits. It would be much more troublesome if some bytes were inserted or missing.

I believe it worked!

This was the precise line that was throwing.

I added this before it:

if len(buf) < 10 {
	println("skipping ZipList")
	return nil, nil
}

Which allowed ./rdb -c aof -o dump.aof dump.rdb to successfully complete.

The resulting .aof file could be successfully loaded into Redis.
I'm still trying to investigate exactly how much data needed to be dropped, but this is already a much better outcome than losing the whole dump.
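One way to find out how much was dropped is to have the guard report each skip instead of returning nil silently. The sketch below assumes the payloads are available as byte slices; checkZipList, countSkipped and errShortZipList are hypothetical names for illustration, not part of the rdb package.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// zipListHeaderSize is the fixed ziplist header: zlbytes(4) + zltail(4) + zllen(2).
const zipListHeaderSize = 10

// errShortZipList marks a payload that cannot even hold the header.
var errShortZipList = errors.New("ziplist payload shorter than header")

// checkZipList is a hypothetical guard mirroring the len(buf) < 10 check.
func checkZipList(buf []byte) error {
	if len(buf) < zipListHeaderSize {
		return fmt.Errorf("%w: got %d bytes", errShortZipList, len(buf))
	}
	return nil
}

// countSkipped applies the guard to every payload, logs each failure to
// stderr, and returns how many payloads would have been skipped.
func countSkipped(payloads [][]byte) int {
	skipped := 0
	for i, p := range payloads {
		if err := checkZipList(p); err != nil {
			fmt.Fprintf(os.Stderr, "payload %d: %v\n", i, err)
			skipped++
		}
	}
	return skipped
}

func main() {
	payloads := [][]byte{make([]byte, 10), {}, make([]byte, 12)}
	fmt.Println("skipped:", countSkipped(payloads))
}
```

Note that a length check only catches payloads too short to hold a header. A ziplist that is long enough but internally inconsistent would still need the recover-based approach or further validation.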

Again, thank you so much for your very quick help on this - and for the amazing tool 🙏

EDIT to add: confirmed only 1 ZipList needed skipping, the luckiest possible outcome 👍

commented

I'm glad I could help you solve the problem.