modelscope / 3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Could you provide the md5 values of train.tar.gz-part-{a-f}?

llearner opened this issue

We downloaded train.tar.gz-part-{a-f}, but the md5 value of the merged file does not match the expected one. We are not sure which part is the corrupted one.

uncompress error message:
tar: Skipping to next header
tar: Archive contains `\0\b\0\004\0\006\0\005\0\004\0' where numeric off_t value expected
tar: Archive contains `\0\005\0\r\0\004\0\004\0\a\0' where numeric time_t value expected
tar: Archive value -1095216594940 is out of uid_t range 0..4294967295
tar: Archive contains `\0\016\0\v\0\t\0' where numeric gid_t value expected
\377\374\377\375\377\002
tar: ▒▒▒▒▒: implausibly old time stamp 1970-01-01 07:59:59
tar: Skipping to next header

gzip: stdin: invalid compressed data--format violated
tar: Child returned status 1
tar: Error is not recoverable: exiting now

Can you provide more details about the train.tar.gz-part-{a-f} files you downloaded and the command you used to uncompress them? From the information provided so far, I cannot determine which file is causing the issue.

Yes, here are the commands and their output:

  1. cat train.tar.gz-part-* > train.tar.gz
  2. tar -zxvf train.tar.gz
    train/
    train/3D_SPK_00001/
    train/3D_SPK_00001/3D_SPK_00001_001_Device01_Distance04_Dialect00.wav
    ...
    train/3D_SPK_01197/3D_SPK_01197_003_Device05_Distance13_Dialect00.wav
    train/3D_SPK_01197/3D_SPK_01197_003_Device06_Distance08_Dialect00.wav
    train/3D_SPK_01197/3D_SPK_01197_003_Device08_Distance12_Dialect00.wav
    tar: Skipping to next header
    tar: Archive contains `\0\b\0\004\0\006\0\005\0\004\0' where numeric off_t value expected
    tar: Archive contains `\0\005\0\r\0\004\0\004\0\a\0' where numeric time_t value expected
    tar: Archive value -1095216594940 is out of uid_t range 0..4294967295
    tar: Archive contains `\0\016\0\v\0\t\0' where numeric gid_t value expected
    \377\374\377\375\377\002
    tar: ▒▒▒▒▒: implausibly old time stamp 1970-01-01 07:59:59
    tar: Skipping to next header

    gzip: stdin: invalid compressed data--format violated
    tar: Child returned status 1
    tar: Error is not recoverable: exiting now
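
The gzip error above indicates that the compressed stream itself is damaged, not only the tar metadata. As a minimal sketch (the function name `test_parts_stream` is hypothetical and not part of the 3D-Speaker scripts), the concatenated parts can be piped through `gzip -t`, which tests the stream without writing the merged archive or extracting anything. This only confirms that corruption exists; it does not say which part is at fault.

```shell
# Test the gzip stream formed by concatenating the parts, without writing
# the merged archive to disk. Returns non-zero if the stream is damaged.
# Usage: test_parts_stream PART...   (parts must be passed in order)
test_parts_stream() {
    # gzip -t reads stdin, checks the compressed data and CRC, writes nothing
    cat "$@" | gzip -t
}
```

A call such as `test_parts_stream train.tar.gz-part-* && echo intact` relies on the shell expanding the glob in alphabetical order, which matches the a to f part order.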

File size (bytes) / File name
203207197134 / train.tar.gz
34359738368 / train.tar.gz-part-a
34359738368 / train.tar.gz-part-b
34359738368 / train.tar.gz-part-c
34359738368 / train.tar.gz-part-d
34359738368 / train.tar.gz-part-e
31408505294 / train.tar.gz-part-f

The md5 value of the merged train.tar.gz is 6e774697a07ae332d51049c418eded85.
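
The sizes listed above are at least self-consistent: five parts of 34359738368 bytes plus one of 31408505294 bytes total 203207197134 bytes, which is the size of the merged file. So the merge step itself does not appear to have lost or added data, and the fault is more likely in the content of one of the parts. A small sketch of that check follows; the helper name `check_merged_size` is hypothetical.

```shell
# Sum the byte sizes of the part files and compare with the merged file.
# Usage: check_merged_size MERGED PART...
check_merged_size() {
    merged=$1
    shift
    total=0
    for part in "$@"; do
        # wc -c reports the size in bytes on both GNU and BSD systems
        size=$(wc -c < "$part")
        total=$((total + size))
    done
    # Arithmetic expansion strips any padding that wc may add
    merged_size=$(( $(wc -c < "$merged") ))
    if [ "$total" -eq "$merged_size" ]; then
        echo "OK: parts sum to $total bytes"
    else
        echo "MISMATCH: parts sum to $total bytes, merged file has $merged_size"
        return 1
    fi
}
```

It could be called as `check_merged_size train.tar.gz train.tar.gz-part-*`. Matching sizes do not prove that the download was complete, since the official part sizes are not given in this thread; they only rule out a problem in the `cat` step.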

We checked the dataset on our side by running cat train.tar.gz-part-* > train.tar.gz followed by md5sum train.tar.gz. The md5 value is c2cea55fd22a2b867d295fb35a2d3340, which matches the value on our website but differs from your result.

Something probably went wrong during the download. Since the md5 values differ, we suggest downloading the files again.

Yes. To avoid re-downloading all the parts, we need to know which part is corrupted. Here are the md5 values of our downloaded files:
train.tar.gz-part-a: 4109addde41d88760947263f18117ac3
train.tar.gz-part-b: ea569fc26d894f5e0c5e38be2820490f
train.tar.gz-part-c: bd2ce08f5b51005b66afe484b01a4a59
train.tar.gz-part-d: 5cd31d961d2d5211aea38b8b95f7239a
train.tar.gz-part-e: 58f3fb7d28ae7f4b65ee35a1ed7ab106
train.tar.gz-part-f: be64551c030e8087562a10df2c74ccb1

Sure, here are the md5 values of the part files:

4109addde41d88760947263f18117ac3  train.tar.gz-part-a
5a17ef2fa28b1b9e340277edffb8b51c  train.tar.gz-part-b
bd2ce08f5b51005b66afe484b01a4a59  train.tar.gz-part-c
5cd31d961d2d5211aea38b8b95f7239a  train.tar.gz-part-d
58f3fb7d28ae7f4b65ee35a1ed7ab106  train.tar.gz-part-e
be64551c030e8087562a10df2c74ccb1  train.tar.gz-part-f

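The train.tar.gz-part-b file you downloaded is corrupted: its md5 value (ea569fc26d894f5e0c5e38be2820490f) differs from ours (5a17ef2fa28b1b9e340277edffb8b51c), while the other five parts match.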
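
Since the list posted by the maintainers is already in the format that `md5sum` prints, the comparison can be automated with `md5sum -c` instead of comparing hashes by eye. The sketch below assumes GNU coreutils and that the six reference lines have been saved to a file in the same directory as the parts. The file name `train.md5` used in the usage note and the function name `find_bad_parts` are hypothetical.

```shell
# Print the names of the files whose MD5 does not match the reference list.
# Usage: find_bad_parts CHECKSUM_FILE   (run in the directory holding the parts)
find_bad_parts() {
    # --quiet hides the "OK" lines; a mismatch is printed as "<name>: FAILED".
    # LC_ALL=C keeps that message untranslated so sed can strip it.
    LC_ALL=C md5sum --quiet -c "$1" 2>/dev/null | sed -n 's/: FAILED.*$//p'
}
```

Running `find_bad_parts train.md5` next to the downloaded parts would print only the names that need to be fetched again. For the hashes in this thread that should be train.tar.gz-part-b alone.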

We will add this information to our shell scripts.