ByteStream Extension fun does not work as expected
lauzadis opened this issue
I tried to replace my conversion code with Flow<ByteArray>.toByteStream, but encountered an error like this.
The GitHub Actions stack trace: https://github.com/hantsy/aws-sdk-kotlin-spring-example/actions/runs/6939132384/job/18875971395?pr=6#step:6:79
The original code that caused this issue:
suspend fun S3Client.store(bucketName: String, resourceKey: String, data: Flux<DataBuffer>) {
    this.createBucketIfNotExists(bucketName)
    val mediaType = MediaTypeFactory.getMediaType(resourceKey)
        .orElseGet { MediaType.APPLICATION_OCTET_STREAM }
    val byteArrayFlow = data
        .map { dataBuffer ->
            val bytes = ByteArray(dataBuffer.readableByteCount())
            dataBuffer.read(bytes)
            DataBufferUtils.release(dataBuffer)
            bytes
        }
        .asFlow()
    val request = PutObjectRequest {
        bucket = bucketName
        body = byteArrayFlow.toByteStream(applicationScope) // here I use toByteStream to transfer the data type.
        key = resourceKey
        contentType = mediaType.toString()
    }
    val result = try {
        this.putObject(request)
    } catch (e: Exception) {
        throw S3ClientException(e.message ?: "Failed to store object $resourceKey")
    }
    println("store object to $bucketName: ${result.eTag}")
}
Hi, I'm able to replicate this. Looking into a potential fix now.
My replication:
runBlocking {
    val client = S3Client.fromEnvironment {
        credentialsProvider = // your credentials here
        region = "us-east-1"
    }
    val byteArrayFlow: Flow<ByteArray> = flowOf("abc".encodeToByteArray(), "def".encodeToByteArray())
    client.putObject {
        bucket = // your bucket here
        key = "playing-with-flows.dat"
        body = byteArrayFlow.toByteStream(this@runBlocking)
    }
}
I see the same error, "Stream must be replayable to calculate a body hash", which is thrown during signing / canonicalization when the body is not replayable.
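For context, the signer has to read the whole body to compute the payload hash and then read it again to send the request, which only works when the body is replayable (buffered). A rough sketch of the difference, assuming smithy-kotlin's ByteStream API (byteArrayFlow and scope refer to the replication above):

import aws.smithy.kotlin.runtime.content.ByteStream
import aws.smithy.kotlin.runtime.content.fromBytes

// Replayable: the bytes are buffered in memory, so they can be hashed
// for signing and still be sent afterwards.
val buffered: ByteStream = ByteStream.fromBytes("abcdef".encodeToByteArray())

// One-shot: backed by a channel fed from the Flow; it can be consumed
// only once, so it cannot be hashed and then replayed.
val streaming: ByteStream = byteArrayFlow.toByteStream(scope)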
I am not sure why it has to accept a CoroutineScope as a parameter, so I created a custom scope like this:
private val applicationScope = CoroutineScope(SupervisorJob() + Dispatchers.IO)
It did not work.
So it cannot process a Flow that is a hot stream, such as a multipart upload from Spring Reactive?
At this time S3 requires Content-Length to be set on all requests (see this issue for more explanation). So, to successfully make the request, you need to provide a content length in the call to toByteStream.
runBlocking {
    val client = S3Client.fromEnvironment {
        credentialsProvider = // your credentials here
        region = "us-east-1"
    }
    val byteArrayFlow: Flow<ByteArray> = flowOf("abc".encodeToByteArray(), "def".encodeToByteArray())
    client.putObject {
        bucket = // your bucket here
        key = "playing-with-flows.dat"
        body = byteArrayFlow.toByteStream(this@runBlocking, 6) // must provide content length
    }
}
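If you don't know the content length upfront, one workaround is to buffer the flow in memory first so the length becomes known. A rough sketch (toBufferedByteStream is a hypothetical helper, not an SDK API; only reasonable for small payloads):

import aws.smithy.kotlin.runtime.content.ByteStream
import aws.smithy.kotlin.runtime.content.fromBytes
import java.io.ByteArrayOutputStream
import kotlinx.coroutines.flow.Flow

// Hypothetical helper: collect the entire Flow so the total length is
// known, then wrap the bytes in a replayable, in-memory ByteStream.
suspend fun Flow<ByteArray>.toBufferedByteStream(): ByteStream {
    val buffer = ByteArrayOutputStream()
    collect { chunk -> buffer.write(chunk) }
    return ByteStream.fromBytes(buffer.toByteArray())
}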
Got it.
I hope there will be a real dynamic Flow improvement that does not require the content length. Spring Reactive does not need it when reading/writing content over HTTP.
If you're handling an HTTP request and proxying data to S3, you might be able to use the HTTP request's Content-Length header if it's available to you.
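For example, with Spring WebFlux's ServerRequest that might look roughly like this (illustrative names; the other identifiers come from the store() function above):

// Sketch: propagate the incoming request's Content-Length so the SDK
// can compute the body hash over a known length.
val contentLength = request.headers().contentLength().orElse(-1L)
if (contentLength >= 0) {
    s3Client.putObject {
        bucket = bucketName
        key = resourceKey
        body = byteArrayFlow.toByteStream(applicationScope, contentLength)
    }
}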
It's possible that we can abstract this in the future, but it's not a limitation of the SDK itself, rather of the underlying S3 PutObject request.
Closing this as there is no further action to take at this time.
⚠️ COMMENT VISIBILITY WARNING ⚠️
Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.
I also stumbled across this issue, and it seems that setting the contentLength isn't (always) enough.
In my case I have:
- an isOneShot ByteStream
- a request of known size and known SHA256
- a request smaller than the chunk size
In this case, in AwsHttpSigner, hashSpecification is set to CalculateFromPayload because:
- contextHashSpecification == null
- body != HttpBody.Empty
- !body.isEligibleForAwsChunkedStreaming
- !isUnsignedPayload
This later leads to the same issue.
It might be that this is really only due to the small object size (~20 KB); chunking might fix this?
My workaround was the following:
S3Client {
interceptors.add(object : HttpInterceptor {
override suspend fun modifyBeforeSigning(context: ProtocolRequestInterceptorContext<Any, HttpRequest>): HttpRequest {
(context.request as? PutObjectRequest)?.let { putObjectRequest ->
val body = putObjectRequest.body
val sha256 = putObjectRequest.checksumSha256
if (body?.isOneShot == true && sha256 != null) {
val sha256Base64 = Base64.getDecoder().decode(sha256).encodeToHex()
// Set SHA256 for signature calculation from known content SHA256
context.executionContext.attributes[AwsSigningAttributes.HashSpecification] = HashSpecification.Precalculated(sha256Base64)
}
}
return context.protocolRequest
}
})
}
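Note the encoding conversion above: checksumSha256 is base64-encoded, while the precalculated hash specification expects the hex-encoded digest used as the SigV4 payload hash, hence the decode-from-base64, re-encode-to-hex step.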
Maybe it would make sense to add this special case to the library?
@felixscheinost can you open a new issue (with a reproduction if possible)? If you have a known content length then it should work with or without chunking, and we should probably take a closer look at what you're seeing.