Rclone (Win), >5GB files and Scaleway S3 (STANDARD or GLACIER)

#1

This may help someone…

In short, you can use rclone (https://rclone.org/downloads/) to upload files directly into the new C14 Glacier from a Windows client, at the same bargain price as the legacy C14 Classic, and for file sizes above 5 GB!

I’m writing this because I spent/lost some time before realizing that 2 of the archives I was uploading were > 5 GB, i.e. more than 1000 parts at the default 5 MB part size, which caused problems (but no errors). Maybe this is because rclone is not one of the supported clients? Yeah, I’m an S3 newbie!

Supported open source clients: AWS CLI / S3cmd / S3FS (https://www.scaleway.com/en/docs/object-storage-feature/)

The upload was indeed looping:
when reaching 4.9GB / 6.0GB, it would stall for a while, then restart “from scratch” at 5.0GB / 12GB, stall again, and loop.
But the archive was not 12 GB!

@Scaleway, could you add a single warning line about > 5 GB objects to this online doc?
Migrate Object Storage with rclone: https://www.scaleway.com/en/docs/how-to-migrate-object-storage-buckets-with-rclone/
It is not any more obvious here (at least from my POV): Multipart uploads: https://www.scaleway.com/en/docs/s3-multipart-upload/

I found the correct parameters… as usual, in the docs (RTFM! :-p):
https://rclone.org/s3/#multipart-uploads

So, after creating a remote using “rclone config” and answering the questions (or writing the remote you want directly into C:\Users\<your_user>\.config\rclone.conf, or whichever config file your rclone actually uses):

[S3-glacier-PAR]
type = s3
provider = Other
env_auth = false
access_key_id = <YOUR_ACCESS_KEY>
secret_access_key = <YOUR_SECRET_ACCESS_KEY>
endpoint = https://s3.fr-par.scw.cloud
acl = private
region = fr-par
location_constraint = fr-par
storage_class = GLACIER
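
Before launching an upload, a quick sanity check that the remote and the credentials work is to list your buckets (this just uses the remote name defined above):
rclone lsd S3-glacier-PAR: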

you can then simply launch:
rclone copy -v -P --s3-chunk-size=20M D:\PATH_TO_YOUR_FILES\2016.7z S3-glacier-PAR:<A_BUCKET>/2016

This uploads the file 2016.7z into the “folder” 2016 of the bucket <A_BUCKET> on the remote described by “S3-glacier-PAR”.
The interesting option is --s3-chunk-size=20M, which raises the chunk size from the default 5 MB to 20 MB, allowing file sizes up to 20 GB (1000 × 20 MB) instead of 5 GB (1000 × 5 MB).
If you need more, increase the chunk size further, as in the example below.
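For example (hypothetical file name and size, just to illustrate the arithmetic): a 50 GB archive needs chunks of at least 50 MB to stay under 1000 parts, so something like 64M leaves headroom:
rclone copy -v -P --s3-chunk-size=64M D:\PATH_TO_YOUR_FILES\big_2016.7z S3-glacier-PAR:<A_BUCKET>/2016
Keep in mind that rclone buffers chunks in memory, so larger values of --s3-chunk-size increase memory use per transfer.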

Also notice the
storage_class = GLACIER
line in the remote definition. It saves you from adding
--s3-storage-class=GLACIER
to the rclone command line… (unless you want your objects in STANDARD).
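Conversely, if you want a one-off upload in STANDARD while keeping GLACIER as the remote’s default, the command-line flag overrides the config value (same placeholders as above):
rclone copy -v -P --s3-chunk-size=20M --s3-storage-class=STANDARD D:\PATH_TO_YOUR_FILES\2016.7z S3-glacier-PAR:<A_BUCKET>/2016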

I’m waiting to see why the size of the bucket is TWICE what I’ve uploaded…
Is it because most of the files spent an hour or so in STANDARD before being transferred to GLACIER??? Or because tens of useless GB were uploaded for nothing for the > 5 GB files???
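In the meantime, a way to compare what rclone itself sees with what the console reports (this counts completed objects only, using the remote from above):
rclone size S3-glacier-PAR:<A_BUCKET>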

We’ll see in the July bill how I get charged, and whether they honor their advertised free migration from C14 Classic…

#2

Hello @tykiki,

Thank you for your feedback. I will update our documentation and add this information. Thank you again for sharing this.

Benedikt

#3

Hi @bene,

I’ve done a few tests since.
I agree with https://community.scaleway.com/t/review-should-you-use-scaleway-s3-object-storage-no/8575 about the empty bucket with occupied size: after uploading a 2.9 GB file twice and removing it, I’m left with an empty bucket that still shows 63 MB (42 MB after the first deletion). Not a big deal… that’s not GB. But it’s a point.

But here is an interesting case you may really want to investigate: a stopped transfer does not free the size uploaded so far!
I uploaded the same 2.9 GB file, but stopped/cancelled the upload at about 75%. The bucket shows 0 files, yet the uploaded amount still counts towards it: 2.3 GB! :grimacing: I’ve read that S3-compatible tools can re-send corrupted parts during a transfer to avoid a full re-upload, but that’s not what is happening here.

I restarted the transfer… maybe it could be resumed after all? …Nope. The file does not exist (HTTP HEAD returned 404), so it re-uploads from scratch.
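For anyone else hitting this: those stuck 2.3 GB are most likely the parts of an unfinished multipart upload. Assuming a reasonably recent rclone, the S3 backend can abort them, which should free the space:
rclone cleanup S3-glacier-PAR:<A_BUCKET>
Note that by default this only removes unfinished uploads older than 24 hours, so the space may not drop immediately.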

Updating the doc is a start… but fixing such issues is really a must.

People can understand that there are problems with “new” functionality, but they lose confidence when nothing is fixed for months…

Will see…

#4

Yes, this is a common problem.

I contributed a max_upload_parts setting to rclone: https://github.com/rclone/rclone/pull/4316.
The community later extended this by adding a dedicated Scaleway provider that autoconfigures all of that.
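With that provider, the remote from post #1 reduces to something like this (a sketch based on rclone’s Scaleway documentation, assuming a recent rclone version; the keys are placeholders, and the 1000-part limit is then handled for you):

[scaleway-glacier]
type = s3
provider = Scaleway
env_auth = false
access_key_id = <YOUR_ACCESS_KEY>
secret_access_key = <YOUR_SECRET_ACCESS_KEY>
region = fr-par
endpoint = s3.fr-par.scw.cloud
acl = private
storage_class = GLACIER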

What is more worrying is that someone also asked support to update the Scaleway docs to reflect this change, and this was the response: https://github.com/rclone/rclone/issues/4159#issuecomment-642159437.