S3 PUT: Setting Content-Length can cause `HttpException`
mgroth0 opened this issue · comments
Describe the bug
Setting the Content-Length in an S3 putObject request can sometimes cause an HttpException. I have narrowed it down further: in the test below, only the ByteStream created with toByteStream triggers the bug, and only when contentLength is also set on the request.
Expected behavior
- The error should be more informative. It is confusing and hard to diagnose when it says "expected 3 bytes but received 175" when you know for sure you are only sending 3 bytes.
- Behavior should be more consistent across ByteStream instances. It doesn't make sense that swapping one ByteStream for another would cause a bug like this. Conceptually, the ByteStream we provide is just a source of bytes. When swapping which ByteStream we use, we do not expect any functional changes to how the HTTP request is sent. Maybe performance-related changes, but nothing major like this.
- Setting the content length in our put-object request to the exact length of the bytes we are providing should never cause an error like this. If bytes are added internally somewhere in the pipeline, then that should be handled internally.
Current behavior
None of the above expectations are met.
Steps to Reproduce
import aws.sdk.kotlin.services.s3.S3Client
import aws.sdk.kotlin.services.s3.putObject
import aws.smithy.kotlin.runtime.content.ByteStream
import aws.smithy.kotlin.runtime.content.toByteStream
import aws.smithy.kotlin.runtime.http.HttpErrorCode.SDK_UNKNOWN
import aws.smithy.kotlin.runtime.http.HttpException
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.test.runTest
import org.junit.jupiter.api.assertThrows
import java.util.concurrent.atomic.AtomicInteger
import kotlin.test.Test
import kotlin.test.assertEquals

class ReplicateContentLengthBug() {
    @Test
    fun replicateContentLengthBug() {
        runTest {
            val client =
                S3Client.fromEnvironment {
                    credentialsProvider = // private
                }
            client.use {
                val myBucket = "my-temp-test-bucket-0"
                fun myTestKey(number: Int) = "/my-temp-test-key-$number"
                val array = byteArrayOf(1, 2, 3)
                val arraySize = array.size.toLong()
                val counter = AtomicInteger()

                // Uploads the stream to a fresh key, optionally setting contentLength on the request.
                suspend fun doPut(
                    stream: ByteStream,
                    contentLengthHeader: Long? = null
                ) {
                    client.putObject {
                        bucket = myBucket
                        key = myTestKey(counter.getAndIncrement())
                        body = stream
                        contentLength = contentLengthHeader
                    }
                }

                // Case 1: in-memory stream, no contentLength on the request. Succeeds.
                doPut(
                    stream = ByteStream.fromBytes(array),
                    contentLengthHeader = null
                )
                // Case 2: flow-backed stream, length given to toByteStream only. Succeeds.
                doPut(
                    stream = flow { emit(array) }.toByteStream(this, contentLength = arraySize),
                    contentLengthHeader = null
                )
                // Case 3: in-memory stream, contentLength set on the request. Succeeds.
                doPut(
                    stream = ByteStream.fromBytes(array),
                    contentLengthHeader = arraySize
                )
                // Case 4: flow-backed stream, length given in both places. Fails.
                val exception =
                    assertThrows<HttpException> {
                        doPut(
                            stream = flow { emit(array) }.toByteStream(this, contentLength = arraySize),
                            contentLengthHeader = arraySize
                        )
                    }
                assertEquals(SDK_UNKNOWN, exception.errorCode)
                assertEquals("java.net.ProtocolException: expected 3 bytes but received 175", exception.message)
            }
        }
    }
}
Possible Solution
- More informative exceptions
- More consistent behavior when using different bytestreams
- Fully handle content-length related issues (like extra bytes being added)
Context
This problem first occurred for me a while ago, maybe almost a year ago. At the time I found a workaround, but it wasn't robust and I didn't fully understand it. Recently the issue came back; I am not sure whether that was due to an update to the library, changes I made on my end, or both. After a lot of debugging I finally narrowed down the problem.
I know it likely has something to do with chunked encoding, which I have seen mentioned in recent changelists. I want to emphasize that anything having to do with chunked encoding should be an implementation detail that is abstracted away by this library. I don't think as a user that I should have to worry about it for basic usage like I showed in the test.
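One way to check the chunked-encoding suspicion is to do the arithmetic. The sketch below assumes (this is not confirmed anywhere in this thread) that the body was wrapped in the SigV4 signed "aws-chunked" framing, where each chunk is sent as `<hex size>;chunk-signature=<64 hex chars>\r\n<data>\r\n` and the body ends with a zero-length chunk. `awsChunkedLength` is a hypothetical helper written for this illustration, not an SDK function. Under those assumptions a 3-byte payload comes out at exactly 175 bytes on the wire, which matches the number in the exception message.

```kotlin
// Hypothetical helper, not part of the AWS SDK: computes the on-the-wire size of a
// payload under the assumed SigV4 signed "aws-chunked" framing.
fun awsChunkedLength(payloadSize: Long, chunkSize: Long = 65_536): Long {
    require(payloadSize >= 0) { "payloadSize must be non-negative" }
    require(chunkSize > 0) { "chunkSize must be positive" }

    val signatureHexLength = 64 // a SHA-256 based signature rendered as hex
    val crlf = 2                // "\r\n"

    // Size of one framed chunk: header line, data, trailing CRLF.
    fun framedChunk(dataSize: Long): Long =
        dataSize.toString(16).length +      // chunk size in hex
            ";chunk-signature=".length +    // fixed 17-character label
            signatureHexLength + crlf +
            dataSize + crlf

    val fullChunks = payloadSize / chunkSize
    val remainder = payloadSize % chunkSize

    var total = fullChunks * framedChunk(chunkSize)
    if (remainder > 0) total += framedChunk(remainder)
    return total + framedChunk(0)           // terminating zero-length chunk
}

fun main() {
    // 89 bytes for the data chunk + 86 bytes for the final empty chunk
    println(awsChunkedLength(3)) // prints 175
}
```

If that assumption holds, the declared Content-Length of 3 described the raw payload while the HTTP layer counted the framed bytes, which would explain why the message reads as if 172 extra bytes were sent.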
AWS Kotlin SDK version used
1.0.73
Platform (JVM/JS/Native)
JVM
Operating System and version
Mac
I'm able to replicate the failure and am taking a look at a fix. As a temporary workaround, specify the contentLength in the toByteStream(...) call instead of on the request object.
client.putObject {
    bucket = myBucket
    key = myTestKey(counter.getAndIncrement())
    body = flow { emit(array) }.toByteStream(this, contentLength = arraySize)
    contentLength = null // don't specify contentLength here.
}
Confirming that removing contentLength from the putObject configuration is a valid workaround. I have been using that workaround since I created this issue, and have not seen any exception since.
Hi, a fix has been merged and should be available in a few hours, under v1.0.79. Thanks again for your detailed report!