fcorbelli / zpaqfranz

Deduplicating archiver with encryption and paranoid-level tests. Swiss army knife for the serious backup and disaster recovery manager. Ransomware neutralizer. Win/Linux/Unix

What causes CRC errors?

ambanmba opened this issue

I've been running some large jobs, and to confirm that I can decompress the contents later I've been running a test, but it keeps throwing these errors. I have recompressed several times and with several combinations of files. The files are on a locally attached fast SSD, and I'm compressing back to the same disk.

1 versions, 312.563 files, 149.927.796.416 bytes (139.63 GB)
To be checked 565.037.768.554 in 297.317 files (4 threads)
7.15 stage time 199813.32 no error detected (RAM ~3.22 GB), try CRC-32 (if any)
Checking 683.340 blocks with CRC-32 (554.101.511.479 not-0 bytes)
ERROR: STORED CRC-32 1B9CC417 != DECOMPRESSED 2B3669AD (ck 00017524) /Volumes/WD Black HD/Master.rar
ERROR: STORED CRC-32 3DFEA54F != DECOMPRESSED 07069F7D (ck 00007542) /Volumes/WD Black HD/H_master.rar
ERROR: STORED CRC-32 B96CFBD3 != DECOMPRESSED 07E52C69 (ck 00007645) /Volumes/WD Black HD/H_pix.rar
ERROR: STORED CRC-32 7462DCDA != DECOMPRESSED F442DED8 (ck 00007120) /Volumes/WD Black HD/Outlook.pst
Block 00682K 502.44 GB
CRC-32 time 50.51s
Blocks 554.101.511.479 ( 683.340)
Zeros negative ( 51.977) 11.613000 s
Total 539.586.707.915 speed 10.681.923.979/sec (9.95 GB/s)
ERRORS : 00000004 (ERROR in rebuilded CRC-32, SHA-1 collisions?)

Short version: are you using 58.5+?
Long version: during a refactoring a bug was introduced in the conversion of the block size to ASCII decimal, done so the blocks can be sorted
CRC blocks can have holes (i.e., block N+1 does not start where block N ends, at start-of-block-N + size-of-block-N)
The holes are... chunks of 0 (zero bytes)
To detect those holes the CRC blocks must be sorted lexicographically
Sorting requires, as stated, a binary-to-ASCII conversion
In "ancient" releases this was hardcoded: OK, quick and dirty
After the refactoring a "universal" function is used

BUT

due to my mistake, the field width limit in the source is 10 digits. Therefore big files (>10 GB) will have unsorted blocks and, if there are some zero holes, an error is reported

In very ancient releases you got ASCII text instead of numbers; in later ones a NEGATIVE number is written (just like yours)
In 58.5+ a bigger length is used, therefore you should not get errors

In the source there are no checks against misaligned block borders, simply because it cannot happen (with good sorting!)
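As an illustration of the sorting bug (hypothetical code, not the actual zpaqfranz source): when a block's position is formatted into a fixed decimal width that is too small, lexicographic order stops matching numeric order.

#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical sketch: sort keys are built by zero-padding a block's
// numeric position to a fixed decimal width. Values that need more
// digits than the width overflow the key, and lexicographic order no
// longer matches numeric order.
static std::string makeKey(uint64_t value, int width) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%0*llu", width,
                  (unsigned long long)value);
    return buf;
}

int main() {
    // Two positions inside a file bigger than 10 GB (11 decimal digits)
    std::vector<uint64_t> positions = {9999999999ULL, 10000000000ULL};
    for (int width : {10, 20}) {
        std::vector<std::string> keys;
        for (uint64_t v : positions) keys.push_back(makeKey(v, width));
        std::sort(keys.begin(), keys.end()); // lexicographic sort
        // width 10: "10000000000" sorts before "9999999999" -> wrong order
        // width 20: both keys are fully zero-padded          -> correct order
        std::printf("width %2d: %s < %s\n", width,
                    keys[0].c_str(), keys[1].c_str());
    }
    return 0;
}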

Please update your software, try again, and let me know

PS: this error does not alter the backups in any way, and does not corrupt anything

I'm using zpaqfranz v58.4s-JIT-L(2023-06-23)

just upgraded to zpaqfranz v58.5o-JIT-L(2023-07-12) and will try again and report back. This takes a looooong time on the machine, so it's just running in the background headless.

For a paranoid test you can use the -paranoid switch

This will extract everything and check one file at a time

Or you can extract manually and run a verify command (with -ssd for a multithreaded test)
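For example (the destination folder here is just a placeholder):

zpaqfranz t file1.zpaq -paranoid -ssd -to /tmp/zcheck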

I'm doing a paranoid ssd check now... what does 694% mean?

PARANOID TEST: working on file1.zpaq
19/07/2023 10:38:52      114.883.753.569 (     114.883.753.569)
Remaining 694 % frags       51.520 (RAM used ~   3.802.155.694)

Please report the exact command line

This seems to be a p command (paranoid test), not a t command (test) with the -paranoid switch

zpaqfranz -ssd p file1.zpaq

OK, p is a different command
It uses different source code that extracts the archive into RAM, just like unzpaq206, single-threaded (no -ssd)

Switches must go after the archive and files, not before
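For example (using the archive name from your command; the first two placements are correct, the last one is wrong):

zpaqfranz p file1.zpaq
zpaqfranz t file1.zpaq -paranoid

not

zpaqfranz -ssd p file1.zpaq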

Please follow the examples

zpaqfranz h t

To get help and examples for the t command (h means help)

zpaqfranz h v
zpaqfranz h p
zpaqfranz h w

For the commands v (verify), p (paranoid), and w (chunked verify)

Ok, will let it run and report back... going with:

zpaqfranz t file1.zpaq -all -ssd -t4

This is a normal t; -ssd does not help in this case
For a full extraction test you have to run

zpaqfranz t file.zpaq -paranoid -ssd -to somefolderonassdwithenoughfreespace

A straight t file.zpaq is more than enough (if you are not paranoid, just like me)

-ssd means "read from the filesystem with as many threads as you want". It is usually used when running sum (hash computing), some verify, or -paranoid
The paranoid test means: extract everything into a temp folder, read back every file, and check the hash; delete the file if the hash is OK, leave it if the hash is different. Use -ssd on solid-state drives

If, after the procedure, the temp folder is empty, that is very good

Otherwise, if some files are present, some hashes did not match, and that is bad
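As a rough sketch of that delete-on-match logic (hypothetical code, not the zpaqfranz source; fileHash here is a stand-in for the real hashers such as XXHASH64 or BLAKE3):

#include <cstdint>
#include <cstdio>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Stand-in hash (FNV-1a) so the sketch is self-contained; zpaqfranz
// actually uses hashes such as XXHASH64 or BLAKE3.
static uint64_t fileHash(const fs::path& p) {
    std::ifstream in(p, std::ios::binary);
    uint64_t h = 1469598103934665603ULL;
    for (int c; (c = in.get()) != std::char_traits<char>::eof(); )
        h = (h ^ (uint64_t)(unsigned char)c) * 1099511628211ULL;
    return h;
}

struct Extracted {
    fs::path path;      // file extracted into the temp folder
    uint64_t expected;  // hash recorded in the archive
};

// Check every extracted file: delete it when the hash matches, leave it
// in place when it does not. An empty temp folder means everything is OK.
static int verifyAndClean(const std::vector<Extracted>& files) {
    int mismatches = 0;
    for (const auto& f : files) {
        if (fileHash(f.path) == f.expected) {
            fs::remove(f.path);
        } else {
            ++mismatches;
            std::fprintf(stderr, "MISMATCH %s\n", f.path.string().c_str());
        }
    }
    return mismatches;
}

int main() {
    // Wiring (building the Extracted list from an archive) is omitted.
    return verifyAndClean({}) == 0 ? 0 : 1;
}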

For example

C:\zpaqfranz>zpaqfranz a z:\pippero *.cpp
zpaqfranz v58.6a-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-07-19)
franz:-hw
Creating z:/pippero.zpaq at offset 0 + 0
Add 2023-07-19 10:47:59        21         69.711.284 (  66.48 MB) 32T (0 dirs)
21 +added, 0 -removed.

0 + (69.711.284 -> 12.078.525 -> 1.686.826) = 1.686.826 @ 118.08 MB/s

0.563 seconds (00:00:00) (all OK)

Now extract everything into the temp folder z:\ugo

C:\zpaqfranz>zpaqfranz t z:\pippero.zpaq -to z:\ugo -paranoid 
zpaqfranz v58.6a-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-07-19)
franz:-hw -paranoid -ssd
z:/pippero.zpaq:
1 versions, 21 files, 1.686.826 bytes (1.61 MB)
Extract 69.711.284 bytes (66.48 MB) in 21 files (0 folders) / 32 T
         9.53% 00:00:00  (   6.33 MB)=>(  66.48 MB)    6.33 MB/sec


FULL-extract hashing check (aka:paranoid)

Total bytes                     69.711.284 (should be 69.711.284)
Bytes checked                   69.711.284 (should be 69.711.284)
Files to be checked                     21
Files ==                                21 (should be 21)
Files !=                                 0 (should be zero)
Files deleted                           21 (shoud be 21)

0.235 seconds (00:00:00) (all OK)

Hash verify against the filesystem (multithreaded)

C:\zpaqfranz>zpaqfranz t z:\pippero.zpaq -verify -ssd
zpaqfranz v58.6a-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-07-19)
franz:-hw -ssd -verify
z:/pippero.zpaq:
1 versions, 21 files, 1.686.826 bytes (1.61 MB)
To be checked 69.711.284 in 21 files (32 threads)
7.15 stage time       0.11 no error detected (RAM ~514.07 MB), try CRC-32 (if any)
Checking               205 blocks with CRC-32 (69.711.284 not-0 bytes)

CRC-32 time           0.00s
Blocks          69.711.284 (         205)
Zeros                    0 (           0) 0.000000 s
Total           69.711.284 speed 4.100.663.764/sec (3.82 GB/s)
GOOD            : 00000021 of 00000021 (stored=decompressed)
VERDICT         : OK                   (CRC-32 stored vs decompressed)
++++++++++++++++++++++++++++++++++++++
Re-testing (hashing) from filesystem (-verify) if possible

Verify hashes of one version vs filesystem (multithreaded)
Total files 21 -> in 021 threads -> 21 to be checked
----------------------------------------------------------------------------------------------------
OK       XXHASH64 : 00000021 of 00000021 (    66.48 MB hash check against file on disk)
----------------------------------------------------------------------------------------------------

0.250 seconds (00:00:00) (all OK)

Short version: what kind of test do you want to do?

Excellent. That all worked. Thanks.