scality / Arsenal

Common utilities for the open-source Scality S3 project components

Listing Implementation Issue

vrancurel opened this issue

versions of the product: all
affects: bucketfile and bucketclient backend

We should be able to list with both a 'prefix' and a 'marker' while having a delimiter.

Right now, _getStartIndex() masks the prefix if you have a marker.

E.g.

?prefix=X11/&marker=X11%2FResConfigP.h

returns the following file names:

Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  3.6 kB         ResourceI.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  2.9 kB         SM/SM.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  11.0 kB        SM/SMlib.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  4.7 kB         SM/SMproto.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  5.1 kB         SelectionI.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  17.0 kB        Shell.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  0.2 kB         ShellI.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  12.4 kB        ShellP.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  29.7 kB        StringDefs.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  3.9 kB         Sunkeysym.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  4.2 kB         ThreadsI.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  16.8 kB        TranslateI.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  2.3 kB         VarargsI.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  2.7 kB         Vendor.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  3.5 kB         VendorP.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  19.7 kB        X.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  13.2 kB        XF86keysym.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  30.3 kB        XKBlib.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  3.9 kB         XWDFile.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  4.5 kB         Xalloca.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  2.9 kB         Xarch.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  2.5 kB         Xatom.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  3.7 kB         Xauth.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  20.8 kB        Xcms.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  2.3 kB         Xdefs.h

As we can see, some CommonPrefixes (e.g. SM/) are listed as files.
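To make the expected behavior concrete, here is a minimal sketch of a listing that honors prefix, marker and delimiter together. It is not the code in delimiter.js: `listKeys`, its option names and the sample keys are hypothetical, and it assumes the request also carries `delimiter=/`. Edge cases such as a marker that falls inside a common prefix are ignored.

```javascript
// Hypothetical sketch, NOT Arsenal's delimiter.js implementation.
// Assumes `sortedKeys` is already in lexicographic order.
function listKeys(sortedKeys, { prefix = '', marker = '', delimiter = '' } = {}) {
    const contents = [];
    const commonPrefixes = [];
    for (const key of sortedKeys) {
        // The prefix filter always applies, even when a marker is given.
        if (!key.startsWith(prefix)) continue;
        // The marker is exclusive: only keys strictly after it are listed.
        if (marker && key <= marker) continue;
        if (delimiter) {
            // Look for the delimiter only in the part after the prefix.
            const idx = key.indexOf(delimiter, prefix.length);
            if (idx !== -1) {
                const cp = key.slice(0, idx + delimiter.length);
                // Keys are sorted, so duplicates of a prefix are adjacent.
                if (commonPrefixes[commonPrefixes.length - 1] !== cp) {
                    commonPrefixes.push(cp);
                }
                continue;
            }
        }
        contents.push(key);
    }
    return { contents, commonPrefixes };
}

// Sample keys modeled on the listing above (assumed to live under X11/).
const keys = [
    'X11/Intrinsic.h',
    'X11/ResConfigP.h',
    'X11/ResourceI.h',
    'X11/SM/SM.h',
    'X11/SM/SMlib.h',
    'X11/SelectionI.h',
    'X11/Shell.h',
    'zlib.h',
];
const result = listKeys(keys, {
    prefix: 'X11/',
    marker: 'X11/ResConfigP.h',
    delimiter: '/',
});
console.log(result.contents);
// [ 'X11/ResourceI.h', 'X11/SelectionI.h', 'X11/Shell.h' ]
console.log(result.commonPrefixes);
// [ 'X11/SM/' ]
```

With these inputs, the SM/ entries collapse into a single common prefix instead of being listed as files.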

The bug is located here: https://github.com/scality/Arsenal/blob/master/lib/algos/list/delimiter.js

It seems there are other inconsistencies in the code.

Please provide consistent functional (for bucketfile) and end-to-end (for metadata) test scenarios.

As there is no "delimiter" parameter in the query string, I do not see what's wrong with the response here. It seems expected that we'd list SM/* since no delimiter was specified.
Did we misunderstand something?

Anyhow, Michael found a sample request in the Amazon documentation that does not seem to work with our implementation. He'll be fixing that first.

Just to be sure, I'll take an example straight from the documentation:
For this example, we assume that we have the following keys in our bucket:

  • sample.jpg
  • photos/2006/January/sample.jpg
  • photos/2006/February/sample2.jpg
  • photos/2006/February/sample3.jpg
  • photos/2006/February/sample4.jpg

The following GET request specifies the delimiter parameter with the value /, and the prefix parameter with the value photos/2006/.

GET /?prefix=photos/2006/&delimiter=/ HTTP/1.1
Host: example-bucket.s3.amazonaws.com
Date: Wed, 01 Mar  2006 12:00:00 GMT
Authorization: authorization string

In response, Amazon S3 returns only the keys that start with the specified prefix. Further, it uses the delimiter character to group keys that contain the same substring until the first occurrence of the delimiter character after the specified prefix. For each such key group Amazon S3 returns one <CommonPrefixes> element in the response. The keys grouped under this CommonPrefixes element are not returned elsewhere in the response. The value returned in the CommonPrefixes element is a substring from the beginning of the key to the first occurrence of the specified delimiter after the prefix. This means we should get the following response:

<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>example-bucket</Name>
  <Prefix>photos/2006/</Prefix>
  <Marker></Marker>
  <MaxKeys>1000</MaxKeys>
  <Delimiter>/</Delimiter>
  <IsTruncated>false</IsTruncated>

  <CommonPrefixes>
    <Prefix>photos/2006/February/</Prefix>
  </CommonPrefixes>
  <CommonPrefixes>
    <Prefix>photos/2006/January/</Prefix>
  </CommonPrefixes>
</ListBucketResult>

This is not the case at the moment (at least when using the memory backend).
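The grouping rule quoted from the documentation can be sketched as a small helper that computes the common prefix of a single key. `commonPrefixOf` is a hypothetical name used only for illustration; it is not part of Arsenal or of the S3 API.

```javascript
// Hypothetical helper illustrating the documented rule; not Arsenal's code.
// Returns the substring from the start of the key up to and including the
// first delimiter found after the prefix, or null when the key does not
// match the prefix or contains no delimiter after it.
function commonPrefixOf(key, prefix, delimiter) {
    if (!key.startsWith(prefix)) return null;
    const idx = key.indexOf(delimiter, prefix.length);
    return idx === -1 ? null : key.slice(0, idx + delimiter.length);
}

// The five keys from the documentation example, in sorted order.
const bucketKeys = [
    'photos/2006/February/sample2.jpg',
    'photos/2006/February/sample3.jpg',
    'photos/2006/February/sample4.jpg',
    'photos/2006/January/sample.jpg',
    'sample.jpg',
];

// A Set keeps one entry per group, in insertion order.
const prefixes = new Set();
for (const key of bucketKeys) {
    const cp = commonPrefixOf(key, 'photos/2006/', '/');
    if (cp !== null) prefixes.add(cp);
}
console.log([...prefixes]);
// [ 'photos/2006/February/', 'photos/2006/January/' ]
```

A key that matches the prefix but yields `null` would be returned as a regular entry rather than grouped; none of the example keys falls in that case.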

Note that, as far as we can tell, the S3 memory backend does not use
the listing algorithms. This increases the risk of missing bugs and of
S3 and Arsenal getting out of sync.

David Pineau
Scality R&D Engineer

There is a preliminary PR taking care of one bug. Could we have more context about the issue at hand if that PR doesn't address it?