mar10 / wsgidav

A generic and extendable WebDAV server based on WSGI

Home Page: https://wsgidav.readthedocs.io

Requesting range off end of file does not return 416 status code

timj opened this issue

Describe the bug

If we request a byte range that starts beyond the end of the file, the server ignores the requested range and returns the full content of the file with a 206 status.

To Reproduce
Steps to reproduce the behavior:

Start a server with a small file in it:

wsgidav --host=127.0.0.1 --port=65160 --auth=anonymous --root=`pwd`/tmp

(I have tried cheroot and uvicorn servers)

Use curl to request content:

This works:

$ curl -H 'Range: bytes=10-20' -i http://127.0.0.1:65160/pyproject.toml
HTTP/1.1 206 Partial Content
date: Thu, 06 Apr 2023 18:08:34 GMT
server: uvicorn
Content-Length: 11
Last-Modified: Thu, 06 Apr 2023 17:16:56 GMT
Content-Type: application/octet-stream
Date: Thu, 06 Apr 2023 18:08:34 GMT
ETag: "120114958-1680801416-3302"
Accept-Ranges: bytes
Content-Range: bytes 10-20/3302

Ask for a byte range that starts inside the file but extends past its end, and this also works (the range is truncated to the file size):

$ curl -H 'Range: bytes=3200-3400' -i http://127.0.0.1:65160/pyproject.toml
HTTP/1.1 206 Partial Content
date: Thu, 06 Apr 2023 18:10:21 GMT
server: uvicorn
Content-Length: 102
Last-Modified: Thu, 06 Apr 2023 17:16:56 GMT
Content-Type: application/octet-stream
Date: Thu, 06 Apr 2023 18:10:21 GMT
ETag: "120114958-1680801416-3302"
Accept-Ranges: bytes
Content-Range: bytes 3200-3301/3302

Now ask for a byte range that starts past the end of the file (bytes=10000-10020, as recorded in the server log below) and this fails:

HTTP/1.1 206 Partial Content
date: Thu, 06 Apr 2023 18:09:09 GMT
server: uvicorn
Content-Length: 3302
Last-Modified: Thu, 06 Apr 2023 17:16:56 GMT
Content-Type: application/octet-stream
Date: Thu, 06 Apr 2023 18:09:10 GMT
ETag: "120114958-1680801416-3302"
Accept-Ranges: bytes
Content-Range: bytes 0-3301/3302

[build-system]
requires = ["setuptools", "lsst-versions >= 1.3.0"]
build-backend = "setuptools.build_meta"

...

The entire file is returned as shown by the Content-Range header.

The server is receiving the request properly:

11:09:10.785 - INFO    : 127.0.0.1 - (anonymous) - [2023-04-06 18:09:10] "GET /pyproject.toml" depth=0, range=bytes=10000-10020, elap=0.000sec -> 206 Partial Content
INFO:     127.0.0.1:53418 - "GET /pyproject.toml HTTP/1.1" 206 Partial Content

Expected behavior

The expectation is that this should return status code 416. It should definitely not return the full content of the file.

For example, requesting a file from GitHub:

$ curl -H 'Range: bytes=100000-100003' -i https://raw.githubusercontent.com/lsst/resources/main/.github/workflows/build.yaml
HTTP/2 416 
cache-control: max-age=300
content-security-policy: default-src 'none'; style-src 'unsafe-inline'; sandbox
content-type: text/plain; charset=utf-8
etag: "07230c3e6c40a24f5ab21c24344d1937f97cbd5c9bea12318aabbc05a3b3caa7"
strict-transport-security: max-age=31536000
x-content-type-options: nosniff
x-frame-options: deny
x-xss-protection: 1; mode=block
x-github-request-id: 187C:5B36:267AC3:2DD550:642F0968
accept-ranges: bytes
content-range: bytes */3045
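
The difference between the two behaviours can also be checked from the client side. The following is only a sketch: `parse_content_range` and `range_response_is_consistent` are hypothetical helper names, and the check only covers requests of the form `first-last`, not suffix ranges. It uses the header values shown in the curl output above.

```python
import re

# Matches "bytes 10-20/3302" as well as the unsatisfied form "bytes */3045".
_CONTENT_RANGE = re.compile(r"^bytes (?:(\d+)-(\d+)|\*)/(\d+|\*)$")


def parse_content_range(value):
    """Return (first, last, total); first and last are None for 'bytes */N'."""
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognised Content-Range: {value!r}")
    first, last, total = match.groups()
    return (
        int(first) if first is not None else None,
        int(last) if last is not None else None,
        None if total == "*" else int(total),
    )


def range_response_is_consistent(status, content_range, requested_first):
    """Check a response against a 'first-last' request (hypothetical helper)."""
    first, _last, _total = parse_content_range(content_range)
    if status == 416:
        # An unsatisfied range carries no byte positions, only the total size.
        return first is None
    if status == 206:
        # A partial response should start where the client asked it to.
        return first == requested_first
    return False
```

With the values from this report, the 206 response carrying `bytes 0-3301/3302` for a request starting at byte 10000 is flagged as inconsistent, while the GitHub response with `bytes */3045` passes.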

Environment:

WsgiDAV/4.2.0 Python/3.10.10(64 bit) macOS-13.3-arm64-arm-64bit
Python from: .../miniconda/envs/lsst-scipipe-5.1.0/bin/python3.10

Which WSGI server was used (cheroot, ext-wsgiutils, gevent, gunicorn, paste, uvicorn, wsgiref, ...)?

cheroot and uvicorn.

Which WebDAV client was used (MS File Explorer, MS Office, macOS Finder, WinSCP, Windows, file mapping, ...)?

Simple curl.

I had a quick look at the code, and the error occurs in util.obtain_content_ranges. The first match in the loop correctly determines the first and last byte positions but does not check whether firstpos > filesize. Instead, it drops out of that block and tries the second match with reSuffixByteRangeSpecifier.search. This time it computes firstpos = filesize - larger_number, gets a negative result, clamps it to firstpos = 0, and you end up with the whole file.

I did get the simple case to work by adding a check for firstpos > filesize in the first check block and then immediately returning [], 0. What I'm not sure about is what should happen if you ask for multiple ranges and only one of them is unsatisfiable.
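
For illustration, the kind of check described here could look like the sketch below. This is not WsgiDAV's util.obtain_content_ranges: `resolve_byte_range` and `satisfiable_ranges` are hypothetical names, and the regular expression is an assumption. Two points follow RFC 7233 as I read it: a range is unsatisfiable when its first position is at or beyond the file size (so the comparison is `>=`, since the last valid position is filesize - 1), and a set of ranges only warrants a 416 when none of its members can be satisfied.

```python
import re

# Hypothetical pattern for one byte-range-spec; not WsgiDAV's own regexes.
_RANGE_SPEC = re.compile(r"^(\d*)-(\d*)$")


def resolve_byte_range(spec, filesize):
    """Resolve one spec such as "10-20", "100-" or "-50".

    Returns an inclusive (first, last) pair, or None when the spec is
    malformed or cannot be satisfied for a file of `filesize` bytes.
    """
    match = _RANGE_SPEC.match(spec.strip())
    if match is None:
        return None
    first_text, last_text = match.groups()
    if not first_text:
        # Suffix form "-N": the last N bytes of the file.
        if not last_text or int(last_text) == 0 or filesize == 0:
            return None
        return max(filesize - int(last_text), 0), filesize - 1
    first = int(first_text)
    if first >= filesize:
        # Starts at or past the end: stop here instead of falling
        # through to the suffix handling.
        return None
    last = min(int(last_text), filesize - 1) if last_text else filesize - 1
    if last < first:
        return None
    return first, last


def satisfiable_ranges(range_header, filesize):
    """Return the satisfiable (first, last) pairs of a "bytes=..." value.

    An empty list means no member of the set can be served.
    """
    unit, _, specs = range_header.partition("=")
    if unit.strip().lower() != "bytes":
        # Unknown range unit: left to the caller (e.g. ignore the header).
        raise ValueError(f"unsupported range unit: {unit!r}")
    resolved = (resolve_byte_range(s, filesize) for s in specs.split(","))
    return [r for r in resolved if r is not None]
```

Under this reading, a request such as `bytes=0-9, 10000-10020` against the 3302-byte file still yields one usable range, so only a request whose list comes back empty would get a 416.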

Yes, I also looked into it a bit yesterday. As I understand it, invalid Range headers may result in a

  • Response 200, returning the complete file,
    or
  • Response 416 with Content-Range: bytes */FILESIZE

I will try to get the latter working. (Multi-Part responses are currently not implemented btw.)
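
As a rough sketch of the second option, a toy WSGI application that answers 416 with Content-Range: bytes */FILESIZE might look like this. It is not WsgiDAV code: `make_app` is a hypothetical name, the content is served from memory, and only a single `first-last` or `first-` range is handled. Anything else falls back to a plain 200 response.

```python
import re

# Only a single "first-last" or "first-" range is recognised in this sketch.
_SIMPLE_RANGE = re.compile(r"^bytes=(\d+)-(\d*)$")


def make_app(body):
    """Build a toy WSGI app that serves `body` and honours simple ranges."""
    size = len(body)

    def full_response(start_response):
        start_response("200 OK", [
            ("Content-Length", str(size)),
            ("Accept-Ranges", "bytes"),
        ])
        return [body]

    def app(environ, start_response):
        match = _SIMPLE_RANGE.match(environ.get("HTTP_RANGE", ""))
        if match is None:
            # No Range header, or a form this sketch does not handle.
            return full_response(start_response)
        first = int(match.group(1))
        last_text = match.group(2)
        if last_text and int(last_text) < first:
            # Syntactically invalid range: ignore the header.
            return full_response(start_response)
        if first >= size:
            # Range starts at or past the end of the content.
            start_response("416 Range Not Satisfiable", [
                ("Content-Range", f"bytes */{size}"),
                ("Content-Length", "0"),
                ("Accept-Ranges", "bytes"),
            ])
            return [b""]
        last = min(int(last_text), size - 1) if last_text else size - 1
        chunk = body[first:last + 1]
        start_response("206 Partial Content", [
            ("Content-Range", f"bytes {first}-{last}/{size}"),
            ("Content-Length", str(len(chunk))),
            ("Accept-Ranges", "bytes"),
        ])
        return [chunk]

    return app
```

The app can be exercised without a server by calling it with a minimal environ dict and a `start_response` callable that records the status and headers.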