mc mirror --overwrite should detect changed files
It seems it currently doesn't:
$ mc mb myminio/mybucket
Bucket created successfully `myminio/mybucket`.
$ echo one > testdir/testfile.txt
$ cat testdir/testfile.txt
one
$ mc mirror --overwrite testdir myminio/mybucket
...estfile.txt: 4 B / 4 B ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ 227 B/s 0s
$ mc cat myminio/mybucket/testfile.txt
one
$ echo two > testdir/testfile.txt
$ cat testdir/testfile.txt
two
$ mc mirror --overwrite testdir myminio/mybucket
0 B / ? ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ 0s
$ mc cat myminio/mybucket/testfile.txt
one
mc version RELEASE.2020-01-25T03-02-19Z
Client and Server: Fedora 31 with XFS as filesystem
minio version 2020-01-25T02:50:51Z
With --overwrite and --preserve:
$ mc mb myminio/mybucket
Bucket created successfully `myminio/mybucket`.
$ echo one > testdir/testfile.txt
$ cat testdir/testfile.txt
one
$ mc mirror --overwrite --preserve testdir myminio/mybucket
...estfile.txt: 4 B / 4 B ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ 283 B/s 0s
$ mc cat myminio/mybucket/testfile.txt
one
$ echo two > testdir/testfile.txt
$ cat testdir/testfile.txt
two
$ mc mirror --overwrite --preserve testdir myminio/mybucket
0 B / ? ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ 0s
$ mc cat myminio/mybucket/testfile.txt
one
@sebschlue this is actually known and expected. mc mirror does not detect changes in a file if its size does not change, and "one" and "two" have the same length.
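A minimal sketch of why the second mirror run above skipped the file, assuming a purely size-based comparison as described in this comment. The function names `changed_by_size` and `changed_by_hash` are hypothetical and are not taken from the mc source; they only contrast the two strategies.

```python
import hashlib


def changed_by_size(src: bytes, dst: bytes) -> bool:
    """Size-only comparison (hypothetical helper): cheap, but blind to same-length edits."""
    return len(src) != len(dst)


def changed_by_hash(src: bytes, dst: bytes) -> bool:
    """Content comparison via SHA-256 (hypothetical helper): needs the full bytes of both sides."""
    return hashlib.sha256(src).digest() != hashlib.sha256(dst).digest()


old = b"one\n"  # bytes written by `echo one`
new = b"two\n"  # bytes written by `echo two`

print(len(old), len(new))         # 4 4
print(changed_by_size(new, old))  # False: the change goes unnoticed
print(changed_by_hash(new, old))  # True: the content differs
```

Both test files are 4 bytes long (three letters plus a newline), which matches the "4 B / 4 B" shown in the first transfer.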
@vadmeste What is the limitation that causes this? It seems inconvenient at best.
No checksum is stored on the server side (the ETag is not equal to the md5sum of the object in some cases).
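As an illustration of one such case, here is a sketch of the ETag scheme commonly observed for multipart uploads on S3-compatible stores: the MD5 of the concatenated per-part MD5 digests, followed by a dash and the part count. This is an assumption for illustration; the thread does not confirm that MinIO computes ETags exactly this way, and the helper names are hypothetical.

```python
import hashlib


def simple_etag(data: bytes) -> str:
    """ETag as plain MD5 of the content (the single-part, unencrypted case)."""
    return hashlib.md5(data).hexdigest()


def multipart_etag(data: bytes, part_size: int) -> str:
    """Assumed multipart scheme: MD5 over the part digests, plus '-<part count>'."""
    parts = [data[i:i + part_size] for i in range(0, len(data), part_size)]
    digests = b"".join(hashlib.md5(p).digest() for p in parts)
    return f"{hashlib.md5(digests).hexdigest()}-{len(parts)}"


data = b"a" * 10
print(simple_etag(data))
print(multipart_etag(data, part_size=4))  # ends in "-3" (parts of 4, 4 and 2 bytes)
```

Under this scheme the same bytes produce different ETags depending on how they were uploaded, so a client cannot simply compare a local md5sum against the ETag.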
On the Slack channel, some people confirmed that it should work when using --preserve.
Ouch. That means for snapshotting certain stuff we'd need to rely on rsync.
Is there a way to append/change some harmless metadata which is checked to force this? Or ensure etag is equal to hash?
For that use rclone
@seqizz rclone calculates a checksum of the entire content. The ETag is not always the md5sum (see SSE-C, multipart uploads, etc.), and md5sum itself is not reliable: different objects can share the same md5sum (https://www.mscs.dal.ca/~selinger/md5collision/), and this is apparently quite common at scale.
Unless, of course, we calculate a checksum of the entire object using something like blake2b. But we would need to calculate it before uploading the content, which would slow down the upload significantly.
rsync is meant for local-disk-to-remote-disk transfers and uses a delta protocol that reads both ends to compute checksums. That would be unexpected for object storage because of cloud costs.
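For a sense of what the blake2b option involves, here is a minimal sketch using Python's standard `hashlib.blake2b`. The helper name `blake2b_hex` and the 1 MiB chunk size are assumptions for illustration, not anything mc does. The point is that every byte must be read once before the upload can start.

```python
import hashlib
import io


def blake2b_hex(stream, chunk_size: int = 1024 * 1024) -> str:
    """Hash a binary stream in chunks so large objects need not fit in memory."""
    h = hashlib.blake2b()
    # iter() with a sentinel keeps calling read() until it returns b"" (end of stream)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


# Same-length contents from the report still yield different digests.
print(blake2b_hex(io.BytesIO(b"one\n")) == blake2b_hex(io.BytesIO(b"two\n")))  # False
```

For a local file, the same helper could be called with a file object opened in binary mode; the extra full read of the source is the cost mentioned above.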
Ah, of course, I am just free-shooting since I'm currently not bound by "cloud traffic costs" :) I'll check out rclone. Thanks.
Just curious, would it even be possible to add another header like etag but containing hash for minio (on create/modify), without breaking compatibility?
It is definitely possible @seqizz, but it is going to be very mc-specific. We have no control over your storage backend anyway, so any state change there wouldn't be properly understood by mc.
This can lead to issues like double copies. It was left out on purpose, as we couldn't figure out a cost-effective way to do it properly for all generalized use cases.
Can this issue be closed, then?
IMHO this needs to be documented more clearly, preferably in the mirror section of mc documentation directly.
But yeah, if this is how minio works, it doesn't sound like a bug.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.