mflatt / s3-sync

Sync a local filesystem with an S3 bucket

failure with "Timeout waiting for 100 continue"

samth opened this issue · comments

The pkg server hung recently because it got this error from s3-sync.

@400000006396e6e52d01eb54 Timeout waiting for 100 continue
@400000006396e6e52d01f324   context...:
@400000006396e6e52d02452c    /home/pkgserver/racket/share/pkgs/http/http/request.rkt:435:0: 100-continue?
@400000006396e6e52d02d5b4    /home/pkgserver/racket/collects/racket/contract/private/arrow-higher-order.rkt:375:33
@400000006396e6e52d03857c    [repeats 1 more time]
@400000006396e6e52d03a0d4    /home/pkgserver/racket/share/pkgs/http/http/request.rkt:765:0: request/redirect/uri
@400000006396e6e52d04315c    /home/pkgserver/racket/collects/racket/contract/private/arrow-higher-order.rkt:375:33
@400000006396e6e52d04d184    /home/pkgserver/racket/collects/racket/contract/private/arrow-val-first.rkt:555:3
@400000006396e6e52d052774    /home/pkgserver/racket/collects/racket/contract/private/arrow-higher-order.rkt:375:33: ...row-higher-order.rkt:375:33
@400000006396e6e52d05ac44    /home/pkgserver/racket/share/pkgs/s3-sync/main.rkt:775:0: put-file-via-bytes
@400000006396e6e52d063114    /home/pkgserver/racket/share/pkgs/s3-sync/main.rkt:591:15
@400000006396e6e52d069e74    /home/pkgserver/racket/share/pkgs/s3-sync/main.rkt:412:27
@400000006396e6e52d07948c s3-sync: error in background task
@400000006396e6e52d07d30c   context...:
@400000006396e6e52d07d6f4    /home/pkgserver/racket/share/pkgs/s3-sync/main.rkt:381:6: get-task-id
@400000006396e6e52d086394    /home/pkgserver/racket/share/pkgs/s3-sync/main.rkt:538:6: maybe-upload
@400000006396e6e52d08f034    /home/pkgserver/racket/share/pkgs/s3-sync/main.rkt:633:12: loop
@400000006396e6e52d097504    [repeats 2 more times]
@400000006396e6e52d09af9c    /home/pkgserver/racket/share/pkgs/s3-sync/main.rkt:134:2
@400000006396e6e52d0a1cfc    /home/pkgserver/pkg-index/official/s3.rkt:13:0: upload-all
@400000006396e6e52d0aa5b4    /home/pkgserver/pkg-index/official/common.rkt:150:0: run!

This looks like an expected error in the case that something goes wrong in the network (and when using multiple jobs). Am I missing something? Or is the intent a feature request to add retries, or change error reporting, or something along those lines?

I reported this for two reasons. One is that I had never seen this failure before, so it wasn't clear whether I should expect it in normal use. The other is that it's not clear how to detect and respond to this situation. If I get the error "error in background task", should I always retry? Or retry once? Or something else?

I suppose s3-sync could report more about the exception, and the fact that it's raised as exn:fail:network might be helpful. When a remote service is involved, I find that retries are common, but it usually doesn't help much to retry only on certain kinds of failure.

The fact that it's exn:fail:network is useful; I'll change it to retry on that (and maybe also add retry support to s3-sync).
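
For anyone hitting the same error, here is a minimal sketch of the caller-side retry described above. It assumes the failure reaches the caller as exn:fail:network, as noted in this thread. The helper name call-with-network-retries, its keyword arguments, and the flaky-upload stand-in are all hypothetical and are not part of s3-sync or pkg-index; in real use the thunk would wrap whatever call performs the sync.

```racket
#lang racket/base

;; Hypothetical helper, not part of s3-sync: call `thunk`, and if it raises
;; exn:fail:network, wait and try again, up to `max-attempts` attempts total.
;; Any other exception, or a network error on the last attempt, propagates.
(define (call-with-network-retries thunk
                                   #:max-attempts [max-attempts 3]
                                   #:delay-seconds [delay-seconds 1])
  (let loop ([attempt 1])
    (with-handlers ([(lambda (e)
                       (and (exn:fail:network? e)
                            (< attempt max-attempts)))
                     (lambda (e)
                       ;; Linear backoff: wait longer after each failed attempt.
                       (sleep (* delay-seconds attempt))
                       (loop (add1 attempt)))])
      (thunk))))

;; Stand-in for the real upload (an assumption for illustration only):
;; fails twice with a network error, then succeeds.
(define calls 0)
(define (flaky-upload)
  (set! calls (add1 calls))
  (when (< calls 3)
    (raise (exn:fail:network "Timeout waiting for 100 continue"
                             (current-continuation-marks))))
  'uploaded)

;; Returns 'uploaded on the third attempt.
(call-with-network-retries flaky-upload #:delay-seconds 0)
```

Capping the number of attempts keeps a persistent outage from turning into an endless loop, and the final failure is still raised so the caller can log it or give up.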