feat: Completing Multipart Uploads with Checksum That Contains Number of Parts Fails
jakub300 opened this issue · comments
Expected Behavior
When completing a multipart upload, a checksum in the format {checksum}-{numberOfParts} should be supported.
Current Behavior
When completing a multipart upload with a checksum in the format {checksum}-{numberOfParts}, the following error is returned:
InvalidArgument: Invalid arguments provided for checksum-bug/minio-checksum-bug/2024-05-04T17:52:34.793Z.mp4: (invalid/unknown checksum sent: invalid checksum)
Various scenarios:
- both s3 and minio allow completing without a checksum
- both s3 and minio fail on an invalid checksum
- both s3 and minio allow completing with a checksum that does not contain the number of parts, e.g. ejt070PMd2pF/mFNTgP6LDtAo2px3L+i/l91VTgvRSc=
- only s3 allows completing when a checksum with the number of parts as a suffix is provided, e.g. ejt070PMd2pF/mFNTgP6LDtAo2px3L+i/l91VTgvRSc=-2
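To make the two accepted shapes concrete, here is a minimal sketch that splits a SHA-256 checksum string into its base64 digest and an optional part count. The helper name `parseSha256Checksum` is hypothetical and is not part of the AWS SDK or MinIO; it only checks the shape of the string, not whether the digest matches the uploaded data.

```javascript
// Hypothetical helper, for illustration only.
// Accepts "{checksum}" or "{checksum}-{numberOfParts}" and returns both pieces,
// or null when the string has neither shape.
function parseSha256Checksum(value) {
  // A SHA-256 digest is 32 bytes, which base64-encodes to 43 characters plus one "=".
  const match = /^([A-Za-z0-9+/]{43}=)(?:-(\d+))?$/.exec(value);
  if (!match) return null;
  return {
    checksum: match[1],
    partCount: match[2] === undefined ? null : Number(match[2]),
  };
}

const digest = "ejt070PMd2pF/mFNTgP6LDtAo2px3L+i/l91VTgvRSc=";
console.log(parseSha256Checksum(digest)); // plain form
console.log(parseSha256Checksum(`${digest}-2`)); // form with the part count suffix
console.log(parseSha256Checksum("a" + digest)); // malformed input
```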
Possible Solution
Fix somewhere in erasure-multipart.go.
Steps to Reproduce (for bugs)
- Create a multipart upload with ChecksumAlgorithm: "SHA256" (likely all other algorithms are also impacted)
- Upload parts with checksums
- Complete the multipart upload with the checksum combined with the number of parts
Sample code (JS)
// tested with node@20.12.2, minio version RELEASE.2024-05-01T01-11-10Z
// save as .mjs file
// set the following env variables before starting or here:
// process.env.S3_BUCKET_NAME = "";
// process.env.AWS_REGION = ""; // only for s3
// process.env.AWS_ACCESS_KEY_ID = "";
// process.env.AWS_SECRET_ACCESS_KEY = "";
// process.env.AWS_ENDPOINT_URL = ""; // only for minio
import {
CompleteMultipartUploadCommand,
CreateMultipartUploadCommand,
ListPartsCommand,
S3Client,
UploadPartCommand,
} from "@aws-sdk/client-s3"; // tested with version 3.569.0
import crypto from "node:crypto";
function getChecksum(buffer) {
const hash = crypto.createHash("sha256");
hash.update(new Uint8Array(buffer));
return hash.digest();
}
const video = await fetch(
"https://github.com/minio/minio/assets/610941/2653d680-2c87-42ea-98d0-8feab199d3ef"
).then((response) => response.arrayBuffer());
const s3 = new S3Client({
forcePathStyle: !!process.env.AWS_ENDPOINT_URL,
});
const MB_5 = 5 * 1024 * 1024;
const KEY = `minio-checksum-bug/${new Date().toISOString()}.mp4`;
const PARTS = Math.ceil(video.byteLength / MB_5);
console.log({ KEY, PARTS });
const multipartUpload = await s3.send(
new CreateMultipartUploadCommand({
Bucket: process.env.S3_BUCKET_NAME,
Key: KEY,
ChecksumAlgorithm: "SHA256",
})
);
const uploadId = multipartUpload.UploadId;
const checksums = [];
for (let i = 0; i < PARTS; i++) {
const start = i * MB_5;
const end = Math.min((i + 1) * MB_5, video.byteLength);
const body = video.slice(start, end);
const checksum = getChecksum(body);
const checksumBase64 = checksum.toString("base64");
checksums.push(checksum);
console.log({ i, checksumBase64 });
await s3.send(
new UploadPartCommand({
Bucket: process.env.S3_BUCKET_NAME,
Key: KEY,
UploadId: uploadId,
PartNumber: i + 1,
Body: body,
ChecksumSHA256: checksum.toString("base64"),
})
);
}
const partsList = await s3.send(
new ListPartsCommand({
Bucket: process.env.S3_BUCKET_NAME,
Key: KEY,
UploadId: uploadId,
})
);
const checksumTotal = getChecksum(Buffer.concat(checksums));
const checksumTotalBase64 = checksumTotal.toString("base64");
console.log({ checksumTotalBase64 });
await s3.send(
new CompleteMultipartUploadCommand({
Bucket: process.env.S3_BUCKET_NAME,
Key: KEY,
UploadId: uploadId,
MultipartUpload: {
Parts: partsList.Parts.map((part) => ({
PartNumber: part.PartNumber,
ETag: part.ETag,
ChecksumSHA256: part.ChecksumSHA256,
})),
},
// Keep exactly one of the following ChecksumSHA256 lines active at a time:
// ChecksumSHA256: "a" + checksumTotalBase64, // GOOD: invalid checksum, rejected by both s3 and minio
// ChecksumSHA256: checksumTotalBase64, // GOOD: works in both s3 and minio
ChecksumSHA256: `${checksumTotalBase64}-${PARTS}`, // BAD: works in s3, does not work in minio
})
);
Test file storage
bbb_sunflower_1080p_30fps_normal_cut_small.mp4
Context
Not a big deal for us at this point, as we can remove the number of parts from the checksum, but I am reporting this because it is a compatibility issue.
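The workaround mentioned above could look roughly like the sketch below. The helper name `stripPartCount` is hypothetical. It relies on the fact that the standard base64 alphabet does not contain "-", so a trailing "-{digits}" can only be the part count suffix.

```javascript
// Hypothetical helper: drops a trailing "-{numberOfParts}" suffix if one is present,
// and returns the checksum unchanged otherwise.
function stripPartCount(checksum) {
  return checksum.replace(/-\d+$/, "");
}

console.log(stripPartCount("ejt070PMd2pF/mFNTgP6LDtAo2px3L+i/l91VTgvRSc=-2"));
console.log(stripPartCount("ejt070PMd2pF/mFNTgP6LDtAo2px3L+i/l91VTgvRSc="));
```

The result can then be passed as ChecksumSHA256 when completing the upload, which according to the scenarios listed earlier is accepted by both s3 and minio.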
Regression
Probably not.
Your Environment
Testing locally on Windows WSL, not really relevant for this issue.
minio version RELEASE.2024-05-01T01-11-10Z (commit-id=7926401cbd5cceaacd9509f2e50e1f7d636c2eb8)
Runtime: go1.21.9 linux/amd64
License: GNU AGPLv3 <https://www.gnu.org/licenses/agpl-3.0.html>
Copyright: 2015-2024 MinIO, Inc.
Linux KUBA-PC 5.15.146.1-microsoft-standard-WSL2 #1 SMP Thu Jan 11 04:09:03 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Yeah. Seems like S3 added the part count later.
#19680 added.