tuna / tunasync

Mirror job management tool.

failed when syncing ubuntu repository

r00t1900 opened this issue

Env

  • tunasync: 0.8.0
  • tunasynctl: 0.8.0
  • tunasync-scripts: master@7817785
  • system: debian 10
  • arch: amd64
  • docker: 20.10.9
  • tunathu/bandersnatch: latest

Description

The tunasync worker always outputs errors like the following once the mirrored size reaches approximately 200 GB:
[image: worker error output]

So far I have not been able to figure out the cause. Here is some related information:

log dirs

/data/tunasync/log/tunasync/ubuntu-r# ls -l
total 1320
lrwxrwxrwx 1 root root     29 Feb 26 12:33 latest -> ubuntu-r_2022-02-26_12_33.log
-rw-r--r-- 1 root root 132381 Feb 26 02:49 ubuntu-r_2022-02-26_01_49.log.fail
-rw-r--r-- 1 root root 133667 Feb 26 04:05 ubuntu-r_2022-02-26_02_49.log.fail
-rw-r--r-- 1 root root 134114 Feb 26 05:29 ubuntu-r_2022-02-26_04_10.log.fail
-rw-r--r-- 1 root root 139153 Feb 26 06:43 ubuntu-r_2022-02-26_05_29.log.fail
-rw-r--r-- 1 root root 139364 Feb 26 08:00 ubuntu-r_2022-02-26_06_48.log.fail
-rw-r--r-- 1 root root 139250 Feb 26 09:05 ubuntu-r_2022-02-26_08_00.log.fail
-rw-r--r-- 1 root root 142588 Feb 26 10:17 ubuntu-r_2022-02-26_09_10.log.fail
-rw-r--r-- 1 root root 139250 Feb 26 11:21 ubuntu-r_2022-02-26_10_17.log.fail
-rw-r--r-- 1 root root 149223 Feb 26 12:33 ubuntu-r_2022-02-26_11_27.log.fail
-rw-r--r-- 1 root root  85575 Feb 26 12:42 ubuntu-r_2022-02-26_12_33.log
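
Each failed log in the listing above carries the run's start time in its file name and its end time in the modification-time column. As a rough sketch (the times are copied by hand from the listing and are only accurate to the minute), the following computes how long each attempt ran and how soon the next one started:

```python
from datetime import datetime

# (start time from the file name, end time from the mtime column), copied
# from the failed-run entries in the listing above; all on 2022-02-26.
RUNS = [
    ("01:49", "02:49"), ("02:49", "04:05"), ("04:10", "05:29"),
    ("05:29", "06:43"), ("06:48", "08:00"), ("08:00", "09:05"),
    ("09:10", "10:17"), ("10:17", "11:21"), ("11:27", "12:33"),
]

def minutes(hhmm):
    """Convert an HH:MM string into minutes since midnight."""
    t = datetime.strptime(hhmm, "%H:%M")
    return t.hour * 60 + t.minute

# How long each attempt ran before it was marked as failed.
durations = [minutes(end) - minutes(start) for start, end in RUNS]
# Gap between one run failing and the next run starting.
gaps = [minutes(RUNS[i + 1][0]) - minutes(RUNS[i][1]) for i in range(len(RUNS) - 1)]

print("run durations (min):", durations)
print("gaps before next run (min):", gaps)
```

Every attempt runs for roughly 60 to 80 minutes before failing, and the next one begins either immediately or about five minutes later. That pattern would be consistent with interval = 5 being measured in minutes, but the listing alone does not confirm it.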

manager.conf

debug = false

[server]
addr = "127.0.0.1"
port = 12345
ssl_cert = ""
ssl_key = ""

[files]
db_type = "bolt"
db_file = "/data/tunasync/manager.db"
ca_cert = ""

worker.conf

[global]
name = "ubuntu-r"
log_dir = "/data/tunasync/log/tunasync/{{.Name}}"
mirror_dir = "/data/tunasync"
concurrent = 10
interval = 1

[manager]
api_base = "http://localhost:12345"
token = ""
ca_cert = ""

[cgroup]
enable = false
base_path = "/sys/fs/cgroup"
group = "tunasync"

[server]
hostname = "localhost"
listen_addr = "127.0.0.1"
listen_port = 6001
ssl_cert = ""
ssl_key = ""

[[mirrors]]
name = "ubuntu-r"
#provider = "two-stage-rsync"
#stage1_profile = "debian"
provider = "rsync"
upstream = "rsync://mirrors.tuna.tsinghua.edu.cn/ubuntu/"
#rsync_options = [ "--delete-excluded" ]
#memory_limit = "1024M"
interval = 5
use_ipv6 = false
  • manager command: tunasync manager -c manager.conf -v
  • worker command: tunasync worker -c worker.conf -v

Appeal

  • Please help me figure out why this happens. I would like to build a mirror from the Tsinghua mirror's data, which is about 1.7 TB. With this many errors, however, progress will be extremely slow.
  • I would also like to understand each parameter in manager.conf and worker.conf, as I could not find any description of them in this repository. For example, in worker.conf an interval parameter exists in both [global] and [[mirrors]], and I do not understand the difference between them.
  • Furthermore, in worker.conf, I do not know why the mirrors section must be written with double [ and ]. Is this a feature of Go?
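
On the double-bracket question: the config files shown here look like TOML, and in TOML a [[name]] header is array-of-tables syntax rather than anything specific to Go. Each repeated [[mirrors]] header appends one more entry to a list, while a single-bracket header such as [global] defines exactly one table. Here is a minimal sketch, assuming Python 3.11+ for the standard-library tomllib module; the second mirror entry is hypothetical and only there to show the list growing:

```python
try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport with the same API

# Trimmed-down copy of the worker.conf from this issue, plus a second,
# hypothetical mirror entry to show how repeated [[mirrors]] headers accumulate.
WORKER_CONF = """
[global]
name = "ubuntu-r"
interval = 1

[[mirrors]]
name = "ubuntu-r"
provider = "rsync"
interval = 5

[[mirrors]]
name = "example-second-mirror"  # hypothetical, not from the issue
provider = "rsync"
"""

config = tomllib.loads(WORKER_CONF)

# [global] is a single table, so it parses to one dict.
print(type(config["global"]).__name__)
# [[mirrors]] is an array of tables, so it parses to a list of dicts.
print(type(config["mirrors"]).__name__)
for mirror in config["mirrors"]:
    # .get() returns None when a mirror does not set its own interval.
    print(mirror["name"], mirror.get("interval"))
```

This only illustrates how the file is structured. How tunasync itself interprets the two interval values is not something the parse result can show.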

Others

I could not extract any useful information from the logs, except this:

grep -rn "error" ubuntu-r_2022-02-26_11_27.log.fail

1540:rsync: [generator] write error: Connection reset by peer (104)
1541:rsync error: error in socket IO (code 10) at io.c(829) [generator=3.1.3]
1542:rsync error: Error in socket I/O
1543:rsync error: received SIGUSR1 (code 19) at main.c(1458) [receiver=3.1.3]
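
To check whether every failed run ends the same way, the grep above can be extended into a small tally of rsync error codes. This is only a sketch: the regular expression is written against the four lines quoted above, and other rsync versions may word their messages differently.

```python
import re
from collections import Counter

# Matches lines such as:
#   rsync error: error in socket IO (code 10) at io.c(829) [generator=3.1.3]
RSYNC_ERROR = re.compile(r"rsync error: (?P<message>.+?) \(code (?P<code>\d+)\)")

def tally_rsync_errors(lines):
    """Count (code, message) pairs found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = RSYNC_ERROR.search(line)
        if match:
            counts[(int(match["code"]), match["message"])] += 1
    return counts

# The four lines reported by grep in this issue, without the line numbers.
SAMPLE = """\
rsync: [generator] write error: Connection reset by peer (104)
rsync error: error in socket IO (code 10) at io.c(829) [generator=3.1.3]
rsync error: Error in socket I/O
rsync error: received SIGUSR1 (code 19) at main.c(1458) [receiver=3.1.3]
""".splitlines()

for (code, message), count in sorted(tally_rsync_errors(SAMPLE).items()):
    print(f"code {code}: {message} (x{count})")
```

Passing an open file handle for each *.log.fail file to tally_rsync_errors would give the same breakdown per run. If code 10 (socket I/O) dominates every log, that points at the connection being dropped rather than at a local disk or permission problem.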

Some more detail about the commented-out content in worker.conf:

[[mirrors]]
name = "ubuntu-r"
#provider = "two-stage-rsync"
#stage1_profile = "debian"
provider = "rsync"
upstream = "rsync://mirrors.tuna.tsinghua.edu.cn/ubuntu/"
#rsync_options = [ "--delete-excluded" ]
#memory_limit = "1024M"
interval = 5
use_ipv6 = false

I first used the commented-out configuration, which uses two-stage-rsync as the provider. However, it produced the sync error described in this issue, so I changed the configuration to follow the debian configuration and just use the rsync provider. In the end both gave almost the same result: mirroring becomes slow once the size exceeds 200 GB.

I guess it is because the interval is too small, and your server is banned because of too frequent connections.

I guess so too. Could you explain the meaning of each parameter in worker.conf? There are two interval parameters as well as concurrent. What would you suggest?

My network speed is approximately 10 MB/s. Thank you~

Please refer to #123 (comment) for an explanation.