IBM / ibm-cos-sdk-java

Make UserMetaData truly case insensitive

YossiTamari opened this issue · comments

Currently UserMetaData inside ObjectMetaData is implemented as a TreeMap ordered by CASE_INSENSITIVE_ORDER, to model the fact that the server treats all metadata names as lowercase. This does not work well enough when two metadata items whose names differ only in casing are entered (e.g. {"hello":"world", "Hello":"Bye"}). The server throws a 403-SignatureDoesNotMatch error. I assume this is because it only sees one of these values, while the client calculates the signature based on both (or perhaps based on the wrong one).
It is really difficult for the API user to connect the thrown AmazonS3Exception to the root cause in this case.

My suggestion is to make the UserMetaData case insensitive by lower-casing all input names. This is what the server will do anyway, and it will make this issue go away.
Another alternative is to validate the UserMetaData and throw an Exception on the client level when there are conflicting names, with a proper error description.
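To make the two options concrete, here is a minimal sketch of each using only `java.util`. The class and method names (`UserMetadataKeys`, `lowerCaseKeys`, `requireNoCaseConflicts`) are hypothetical and are not part of the SDK; the sketch only illustrates the behavior each option would have on a plain `Map`.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not an SDK class.
class UserMetadataKeys {

    // Option 1: lower-case every key. If two keys differ only by case,
    // whichever is iterated last silently replaces the earlier value.
    static Map<String, String> lowerCaseKeys(Map<String, String> input) {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> e : input.entrySet()) {
            result.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
        }
        return result;
    }

    // Option 2: reject the map when two keys differ only by case.
    static void requireNoCaseConflicts(Map<String, String> input) {
        // Maps each lower-cased key to the original key that produced it.
        Map<String, String> seen = new HashMap<>();
        for (String key : input.keySet()) {
            String previous = seen.put(key.toLowerCase(Locale.ROOT), key);
            if (previous != null) {
                throw new IllegalArgumentException(
                        "User metadata keys '" + previous + "' and '" + key
                        + "' differ only by case");
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> meta = new HashMap<>();
        meta.put("hello", "world");
        meta.put("Hello", "Bye");

        System.out.println(lowerCaseKeys(meta).keySet()); // prints [hello]
        try {
            requireNoCaseConflicts(meta);
        } catch (IllegalArgumentException ex) {
            System.out.println("rejected: " + ex.getMessage());
        }
    }
}
```

With option 1 the surviving value depends on the iteration order of the input map, which is the ambiguity that the validation in option 2 avoids.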

@YossiTamari I agree that client validation of custom headers is necessary here. I would be more in favor of your second suggestion of throwing an exception when conflicting names are encountered. My concern with the first approach is that where we have the same header with multiple values, choosing one value over the others could lead to more confusion.

@Patrick-Browne I understand your reasoning, just keep in mind that the first option is what is currently happening (unless the server decides to throw an exception, which is not consistent), so it is the more backward-compatible change. It is also what happens if you set the same key twice (case-equal).

Can this issue be closed?

@zhuojc No. There is an agreement that a fix is needed here. It is not 100% sure which of the two fix options is best, but one of them needs to be implemented.

@Patrick-Browne We should pursue the option of returning an error with the appropriate code and description. I agree that choosing one value over the other could lead to confusion.

There's an additional fix needed for this. The SDK prefixes user metadata with 'x-amz-meta'. If the user adds a custom request header, e.g. 'x-amz-meta-mycustomheader', and also adds a user metadata header called 'mycustomheader', this will result in a 403-SignatureDoesNotMatch error.
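A rough sketch of how such a collision could be detected on the client before a request is sent. The class `MetaHeaderConflicts` and its method are hypothetical, and the prefix string `x-amz-meta-` (with trailing hyphen) is inferred from the example header name above rather than taken from the SDK source.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not an SDK class.
class MetaHeaderConflicts {

    // Assumed prefix, based on the 'x-amz-meta-mycustomheader' example.
    static final String PREFIX = "x-amz-meta-";

    // Returns the user metadata keys whose prefixed header name also
    // appears, ignoring case, among the custom request headers.
    static List<String> findConflicts(Map<String, String> customHeaders,
                                      Map<String, String> userMetadata) {
        Set<String> headerNames = new HashSet<>();
        for (String name : customHeaders.keySet()) {
            headerNames.add(name.toLowerCase(Locale.ROOT));
        }
        List<String> conflicts = new ArrayList<>();
        for (String key : userMetadata.keySet()) {
            String prefixed = (PREFIX + key).toLowerCase(Locale.ROOT);
            if (headerNames.contains(prefixed)) {
                conflicts.add(key);
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        Map<String, String> customHeaders = new HashMap<>();
        customHeaders.put("x-amz-meta-mycustomheader", "from-header");

        Map<String, String> userMetadata = new HashMap<>();
        userMetadata.put("mycustomheader", "from-metadata");

        System.out.println(findConflicts(customHeaders, userMetadata)); // prints [mycustomheader]
    }
}
```

A caller could either throw on a non-empty result or drop the conflicting metadata entries, depending on which policy is preferred.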

We're targeting a solution for this in release 2.1.3. I propose that we:

  1. Make the UserMetaData case insensitive by lower-casing all input names as suggested by @YossiTamari
  2. Silently ignore user metadata headers that conflict with custom request headers

@YossiTamari after reviewing the issue, it looks like the SDK handles duplicate metadata headers by overwriting the value. This is true regardless of the text case used for the header, so in any PUT request only one instance of the metadata header will be present. Where we get issues is when a custom request header that matches a metadata header in the same request is also added. This will result in duplicate headers and a 403 SignatureDoesNotMatch error. The best way to prevent this error is to avoid prefixing custom request headers with 'x-amz-meta'. We will update the SDK documentation to reflect this.

@Patrick-Browne My headers were not prefixed with 'x-amz-meta'. Are you sure that the SDK calculates the signature based only on the lower cased one?

@YossiTamari testing with AWS4Signer, the SDK converts the headers to lower case before calculating the signature. The AWS4Signer builds a canonicalized request and handles the headers by calling its getCanonicalizedHeaderString() method, as shown in the code snippet below. I'll need to confirm that the other SDK signers handle headers in a similar way. When you produced the 403, were you using HMAC or IAM authentication?

```java
protected String getCanonicalizedHeaderString(SignableRequest<?> request) {
    final List<String> sortedHeaders = new ArrayList<String>(request.getHeaders()
            .keySet());
    Collections.sort(sortedHeaders, String.CASE_INSENSITIVE_ORDER);

    final Map<String, String> requestHeaders = request.getHeaders();
    StringBuilder buffer = new StringBuilder();
    for (String header : sortedHeaders) {
        if (shouldExcludeHeaderFromSigning(header)) {
            continue;
        }
        String key = StringUtils.lowerCase(header);
        String value = requestHeaders.get(header);

        StringUtils.appendCompactedString(buffer, key);
        buffer.append(":");
        if (value != null) {
            StringUtils.appendCompactedString(buffer, value);
        }

        buffer.append("\n");
    }

    return buffer.toString();
}
```
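To see what this loop does when header names differ only by case, here is a simplified, standalone stand-in that runs on a plain `Map`. It is a sketch, not the SDK code: `CanonicalHeadersDemo` is a hypothetical class, whitespace compaction and header exclusion are omitted, and it assumes the header map handed to the signer is case-sensitive.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; a simplified stand-in for the quoted signer method.
class CanonicalHeadersDemo {

    // Sorts header names ignoring case, lower-cases each name, and emits
    // one "name:value\n" line per map entry.
    static String canonicalize(Map<String, String> headers) {
        List<String> sorted = new ArrayList<>(headers.keySet());
        Collections.sort(sorted, String.CASE_INSENSITIVE_ORDER);
        StringBuilder buffer = new StringBuilder();
        for (String name : sorted) {
            buffer.append(name.toLowerCase(Locale.ROOT)).append(':');
            String value = headers.get(name);
            if (value != null) {
                buffer.append(value);
            }
            buffer.append('\n');
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        // Case-sensitive map: both spellings survive as separate entries.
        Map<String, String> caseSensitive = new HashMap<>();
        caseSensitive.put("x-amz-meta-hello", "world");
        caseSensitive.put("x-amz-meta-HELLO", "work");

        // Case-insensitive map: the two spellings collapse into one entry.
        Map<String, String> caseInsensitive = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        caseInsensitive.putAll(caseSensitive);

        System.out.print(canonicalize(caseSensitive));   // two lines, same lower-cased name
        System.out.print(canonicalize(caseInsensitive)); // one line
    }
}
```

Lower-casing the names does not merge the entries: a case-sensitive map still contributes two lines with the same name to the string being signed, while a case-insensitive map contributes one.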

The attached file contains a simple test that shows that creating two user metadata keys, hello and HELLO, with different values causes the upload to fail.
S3Test.txt

Note that some constants need to be given values before running.

@YossiTamari thanks for the test code. I think the problem here is that you're replacing ObjectMetadata's default map which is case insensitive with a custom map which is case sensitive.

ObjectMetadata by default will store user metadata in a case insensitive TreeMap. If we add a duplicate header, as shown in the code below, the second put value will overwrite the first put value. Duplicate headers will not be used in the signature calculation and we avoid a SignatureDoesNotMatch error.

```java
ObjectMetadata omd = new ObjectMetadata();
omd.addUserMetadata("hello", "world");
omd.addUserMetadata("HELLO", "work");
s3Client.putObject(bucketName, key, new ByteArrayInputStream(content), omd);
```

In your test code you're creating your own HashMap which is case sensitive and therefore allows duplicate keys with different case. You're then replacing ObjectMetadata's default map with this new HashMap by using the setUserMetadata() method.

While the SDK allows the user to override the default user metadata map, the onus is on the user to ensure that their custom map does not include duplicate headers, either by using a case-insensitive TreeMap (similar to ObjectMetadata) or by filtering the keys.
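As a sketch of the first workaround, a caller could copy an arbitrary map into a `TreeMap` ordered by `String.CASE_INSENSITIVE_ORDER` before handing it to `setUserMetadata()`. The helper class `CaseInsensitiveCopy` is hypothetical; the SDK call appears only as a comment so that the example runs with `java.util` alone.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not an SDK class.
class CaseInsensitiveCopy {

    // Copies any map into a TreeMap that compares keys ignoring case.
    // Keys that differ only by case collapse into one entry: the first
    // spelling inserted is kept as the key, the last value inserted wins.
    static Map<String, String> caseInsensitiveCopy(Map<String, String> source) {
        Map<String, String> copy = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        copy.putAll(source);
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> custom = new HashMap<>();
        custom.put("hello", "world");
        custom.put("HELLO", "work");

        Map<String, String> safe = caseInsensitiveCopy(custom);
        System.out.println(safe.size()); // prints 1

        // With the SDK on the classpath, the copy would be passed on like this:
        // omd.setUserMetadata(safe);
    }
}
```

Which of the two values survives depends on the iteration order of the source map, so callers who care about the result should resolve the conflict themselves before copying.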

Since sending a set of metadata is a very common use-case, I personally think this approach is just asking for trouble, and it would be better to drop the setUserMetadata method from the API (or fix it so that it would copy the values to the internal map, using putAll). This is bad API design that exposes the internal implementation of the API to the API user, and basically encourages him/her to break it.
If you leave it as-is, I believe you need to add documentation for this method that says the implementation of the Map must comply with these specific requirements. It is not reasonable to expect the user to know this.

@Patrick-Browne Where are we with this? Can you post an update here please? Thanks!

We've implemented a change to the setUserMetadata() method that copies the contents of the user supplied map to ObjectMetadata's case insensitive treemap. This will ensure backward compatibility and prevent the addition of duplicate headers. The change can be targeted for next release.

@YossiTamari for architectural reasons we are not seeking to make an SDK change to ObjectMetadata at this point. We will update the user documentation to explain how to avoid this error when setting user metadata with a custom map.

Closed #8.