A clarification in the documentation
stkim1 opened this issue
Hello @Redundancy,
I've recently stumbled upon go-sync and am quite impressed with the underlying implementation and its accompanying tests. Writing test cases easily doubles (or in some cases nearly triples) the workload, and it's a bit discouraging to see that effort go unnoticed by others. Plus, go-sync appears to have broader platform coverage, as it stays away from `<netinet/in.h>` and `<arpa/inet.h>`. 👍
A few questions arose while following the code base. The README points out that:

> The ZSync mechanism has the weakness that HTTP1.1 ranged requests are not always well supported by CDN providers and ISP proxies. When issues happen, they're very difficult to respond to correctly in software (if possible at all). Using HTTP 1.0 and fully completed GET requests would be better, if possible.
This is a very appealing point, as it is not unusual to face such issues with low-end hosting services for unknown reasons. Looking into `DoRequest` of `HTTPBlockSource`, where the request header is composed, however, we can see that the operation uses HTTP 1.1 and the `Range` header specifically. I expected there to be a fallback HTTP 1.0 measure to handle a possible failure, but found none.
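For illustration, this is roughly the shape of the request in question; a minimal sketch in plain `net/http` (not go-sync's actual code; the function and package names are mine):

```go
package example

import (
	"fmt"
	"io/ioutil"
	"net/http"
)

// fetchRange issues the kind of ranged request discussed above. Range
// headers are inclusive on both ends: "bytes=0-4095" asks for 4096 bytes.
func fetchRange(url string, start, end int64) ([]byte, error) {
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", start, end))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	// A server that honours the range replies 206 Partial Content. A 200
	// here means an intermediary ignored the Range header and is sending
	// the whole file - the failure mode the README warns about.
	if resp.StatusCode != http.StatusPartialContent {
		return nil, fmt.Errorf("expected 206 Partial Content, got %d", resp.StatusCode)
	}
	return ioutil.ReadAll(resp.Body)
}
```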
I might not have a comprehensive understanding of how go-sync is built. If you could point me in the right direction, it would be very much appreciated.
Thank you very much for this charming work!
You're right - the existing implementation does do something that I explicitly warn against. Note that, hypothetically speaking, including cache-busting headers in the responses can help.

The main reason I don't do it is that it's quite prescriptive to decide to use HTTP 1.0 and come up with an appropriate scheme for chopping up the payload. The ideal situation would often be to match the granularity to your units of change (as described a bit lower down).

However, at least theoretically, it should not be too difficult to provide a different implementation of `HttpRequester` that could source block ranges from potentially multiple files, and fall back to downloading and caching an entire file if a ranged request failed. The work required would largely be to decide how to chop up the input, lay that out as URIs, and decide whether that should be handled dynamically by a webserver or statically (say, on S3).

In order to achieve this, it would probably be best to embed the `Verifier` into the requester so that it can handle and respond to more cases internally (see `BlockSourceBase`).

The intent is that the use of Go's interfaces allows delegation of responsibility: the `HttpRequester` and `BlockSourceBase` should be almost entirely pluggable, and how you would get the information about the file splits could be an external concern.
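To make the idea concrete, something along these lines might be a starting point. It's only a rough, hypothetical sketch: `FallbackRequester` and `NewFallbackRequester` are made-up names, the `DoRequest(start, end int64)` shape is approximate, and offsets are treated as a half-open [start, end) interval, which may not match the actual convention in the code.

```go
package example

import (
	"fmt"
	"io/ioutil"
	"net/http"
	"sync"
)

// FallbackRequester (a hypothetical name) first tries an HTTP/1.1 ranged
// request; if that fails, it downloads and caches the whole file once and
// serves every later block request from the cached copy.
type FallbackRequester struct {
	url    string
	client *http.Client

	mu     sync.Mutex
	cached []byte // full file body, populated lazily on first range failure
}

func NewFallbackRequester(url string) *FallbackRequester {
	return &FallbackRequester{url: url, client: http.DefaultClient}
}

// DoRequest treats offsets as half-open, [start, end).
func (f *FallbackRequester) DoRequest(start, end int64) ([]byte, error) {
	if data, err := f.rangedGet(start, end); err == nil {
		return data, nil
	}
	// The ranged request failed (e.g. a proxy answered 200 with the whole
	// body, or 416): fall back to one complete GET and slice it locally.
	whole, err := f.wholeFile()
	if err != nil {
		return nil, err
	}
	if start > int64(len(whole)) {
		start = int64(len(whole))
	}
	if end > int64(len(whole)) {
		end = int64(len(whole))
	}
	return whole[start:end], nil
}

func (f *FallbackRequester) rangedGet(start, end int64) ([]byte, error) {
	req, err := http.NewRequest("GET", f.url, nil)
	if err != nil {
		return nil, err
	}
	// Range headers are inclusive, hence end-1 for a half-open interval.
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", start, end-1))
	resp, err := f.client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusPartialContent {
		return nil, fmt.Errorf("range not honoured: status %d", resp.StatusCode)
	}
	return ioutil.ReadAll(resp.Body)
}

func (f *FallbackRequester) wholeFile() ([]byte, error) {
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.cached != nil {
		return f.cached, nil
	}
	resp, err := f.client.Get(f.url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	f.cached = body
	return f.cached, nil
}
```

Caching the whole body in memory is obviously a simplification; a real implementation would probably spill to disk, and embedding the `Verifier` as described above would let it validate the fallback download internally.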
I appreciate your input. I've been digging in and have started drawing diagrams to picture how the code is designed. It seems, though, that writing test cases against possible failures, especially those related to HTTP connections, would give me practical leverage to catch up with the design and build the HTTP 1.0 fallback measure.

Coincidentally, you've stated that Network Error handling is one of the improvements you've been seeking, and it seems to be the top priority for the production readiness mentioned in #15.

I have had a very pleasant experience with network simulators such as Network Link Conditioner, albeit only on OS X. I don't believe that particular utility could be deployed in these test cases, but the notion of using an artificial conditioner to simulate network issues should hold to some degree.

In search of similar tools, I've encountered tylertreat/comcast and Shopify/toxiproxy. Both projects are heavily oriented toward CLI usage, but I think it should be possible to incorporate one of them into the tests within an affordable time window. I'll start building test cases and make PRs (a sketch of the kind of test I have in mind is below). I hope you can spare some time to review the additional tests and guide me to cover the corner cases properly.
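For the simplest failure modes, the standard library's `net/http/httptest` might already go a long way before reaching for an external conditioner. A first sketch of such a test, reusing the hypothetical `FallbackRequester` from the sketch above, simulating a proxy that ignores `Range`:

```go
package example

import (
	"net/http"
	"net/http/httptest"
	"testing"
)

// Simulates a misbehaving proxy that ignores the Range header and always
// answers 200 with the full body - the failure the README describes - and
// checks that the fallback path still yields the right block.
func TestRangeIgnoredByProxy(t *testing.T) {
	payload := []byte("0123456789abcdef")

	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Deliberately ignore r.Header.Get("Range") and send everything.
		w.WriteHeader(http.StatusOK)
		w.Write(payload)
	}))
	defer srv.Close()

	requester := NewFallbackRequester(srv.URL)
	got, err := requester.DoRequest(4, 8) // half-open: bytes 4..7
	if err != nil {
		t.Fatalf("expected the fallback to succeed, got: %v", err)
	}
	if string(got) != "4567" {
		t.Fatalf("expected block %q, got %q", "4567", got)
	}
}
```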