klauspost / reedsolomon

Reed-Solomon Erasure Coding in Go

Reconstructing data with corrupted shard(s)

Root-man opened this issue

Hello,
I have a question about reconstructing the original data when some of the data shards are corrupted (for example, some of the bytes in a shard's []byte are set to 0).
In this case I can detect that my dataset is invalid by using the Verify function.
But if I then pass the dataset to ReconstructData, it does nothing, because of this code:

	// Quick check: are all of the shards present?  If so, there's
	// nothing to do.
	numberPresent := 0
	dataPresent := 0
	for i := 0; i < r.Shards; i++ {
		if len(shards[i]) != 0 {
			numberPresent++
			if i < r.DataShards {
				dataPresent++
			}
		}
	}
	if numberPresent == r.Shards || dataOnly && dataPresent == r.DataShards {
		// Cool.  All of the shards data data.  We don't
		// need to do anything.
		return nil
	}

Is it possible to reconstruct the data in this case?

You should not use Verify as an integrity check. It is very weak, and it cannot tell you which shard is broken. See the "usage" section of the README:

The encoder does not know which parts are invalid, so if data corruption is a likely scenario, you need to implement a hash check for each shard. If a byte has changed in your set, and you don't know which it is, there is no way to reconstruct the data set.
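A minimal sketch of such a per-shard hash check, using only the Go standard library. The helper names hashShards and dropCorrupted are hypothetical and not part of the reedsolomon package, and SHA-256 is just one reasonable choice of hash. The idea is to store a digest for every shard after encoding, and before reconstruction to set every shard whose digest no longer matches to nil so the encoder treats it as missing.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// hashShards records a SHA-256 digest for every shard. Call it right after
// encoding and store the digests alongside the shards.
// (Hypothetical helper, not part of the reedsolomon package.)
func hashShards(shards [][]byte) [][sha256.Size]byte {
	sums := make([][sha256.Size]byte, len(shards))
	for i, s := range shards {
		sums[i] = sha256.Sum256(s)
	}
	return sums
}

// dropCorrupted sets every shard whose digest no longer matches to nil and
// returns the indexes it dropped. It assumes sums has one entry per shard.
// (Hypothetical helper, not part of the reedsolomon package.)
func dropCorrupted(shards [][]byte, sums [][sha256.Size]byte) []int {
	var dropped []int
	for i, s := range shards {
		if sha256.Sum256(s) != sums[i] {
			shards[i] = nil // a nil shard counts as missing
			dropped = append(dropped, i)
		}
	}
	return dropped
}

func main() {
	// Stand-in shards; in real use these come from the encoder.
	shards := [][]byte{[]byte("aaaa"), []byte("bbbb"), []byte("cccc")}
	sums := hashShards(shards)

	shards[1][0] = 0 // simulate a corrupted byte in shard 1

	fmt.Println(dropCorrupted(shards, sums)) // indexes of the shards set to nil
}
```

After dropCorrupted, the same slice can be passed to Reconstruct or ReconstructData. The quick check quoted in the question then no longer counts the dropped shards as present, and the data can be rebuilt as long as at least DataShards shards are still intact.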

Ok, thank you!