S3Cache improperly requires ListBucket permission (incurs higher cost) #609
Comments
Possible Solution:
@daniel-keller this is still an issue. We have made some inquiries to AWS about this (and are waiting for an official response about the inner workings), but after reading through all the code here and checking the SDK properly, my current perception is that the issue might actually be related to the walkObjects method on the S3 utils class, used not for We have many Cantaloupes running and this is becoming an issue. Sadly, you cannot deny the s3:ListBucket permission if you want cache expiration at all, even if we code around the 403 response (the line you pointed to) in the GET part. I would love to explore optimizing that listing (even if only as an option), maybe by keeping a local cache of generated derivative prefixes so the listing is less expensive. We will try running a few servers without cache expiration for a month to check whether that is the case.
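To make the "local cache of generated derivative prefixes" idea more concrete, here is a rough sketch in Python (for illustration only; it is not Cantaloupe code). It assumes each server keeps an in-memory record of the keys it has written, so that expiration can work from that record instead of listing the bucket. The class name `DerivativeIndex` and its methods are hypothetical.

```python
import time
from typing import Dict, List, Optional


class DerivativeIndex:
    """Hypothetical in-memory record of cache keys written by this process."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self._written_at: Dict[str, float] = {}

    def record_write(self, key: str, now: Optional[float] = None) -> None:
        # Remember when the derivative was written to the cache bucket.
        self._written_at[key] = time.time() if now is None else now

    def pop_expired(self, now: Optional[float] = None) -> List[str]:
        # Return (and forget) every key older than the TTL, without any listing call.
        now = time.time() if now is None else now
        expired = [k for k, t in self._written_at.items()
                   if now - t > self.ttl_seconds]
        for key in expired:
            del self._written_at[key]
        return sorted(expired)
```

The returned keys could then be deleted one by one with plain delete requests. The obvious limitation is that an in-memory record is lost on restart and is not shared between servers, so an occasional full listing (or a persisted index) would probably still be needed.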
TL;DR:
I don't think Cantaloupe should use or require the "ListBucket" AWS S3 permission to use S3 Cache or S3 Source (List requests cost more than Get requests). Without the "ListBucket" permission, Cantaloupe can read images from an S3 Source and read/write manifests to the S3 Cache, but it breaks when writing images to the S3 Cache. Any insight into why this is?
Evidence of high ListBucket calls when implementing Cantaloupe.
Cantaloupe makes 100K ListBucket API calls but only 800 GetObject API calls.
Long version:
I recently set up Cantaloupe in a Docker container on AWS using two S3 buckets as source and cache. The setup works well and I'm happy with the performance. However, I've noticed an unusually high number of "ListBucket" S3 API requests being made by Cantaloupe, and these are relatively expensive. I'm wondering why Cantaloupe needs "ListBucket" at all, since I am providing the bucket name and region for both the cache and the source.
When I disable the ListBucket permission on my AWS user (keeping other permissions, e.g. ReadObject, WriteObject, DeleteObject, etc.), Cantaloupe successfully reads images/manifests from the source and writes manifests to the cache, but fails to write images to the cache.
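For reference, the kind of policy described above might look roughly like the sketch below, written as a small Python helper that builds the policy document. The bucket name is a placeholder, and I am assuming the standard IAM action names `s3:GetObject`, `s3:PutObject` and `s3:DeleteObject` for the read/write/delete permissions mentioned above. `s3:ListBucket` is deliberately left out.

```python
import json


def build_policy(bucket: str) -> dict:
    """Build an object-level policy for one bucket, without s3:ListBucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Object-level actions only; no bucket-level listing.
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                # Object actions apply to keys inside the bucket, hence the "/*".
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


if __name__ == "__main__":
    # "example-cantaloupe-cache" is a placeholder bucket name.
    print(json.dumps(build_policy("example-cantaloupe-cache"), indent=2))
```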
Since I can't find "ListBucket" in the Cantaloupe source code, I assume it's being used internally by the AWS SDK, and I suspect it has something to do with the "GetObjectRequest" API call made here.
The AWS SDK docs for "GetObjectRequest" note that if ListBucket is not permitted, "Access Denied" will be thrown instead of "No Such Key".
My theory is this: Cantaloupe queries the cache for the key and, not finding it, receives "Access Denied" where it expects "No Such Key". "Access Denied" doesn't necessarily imply "no access to the bucket"; it could also mean "no permission to list the bucket".
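If that theory is right, one possible workaround would be for the cache read to treat a 403 like a cache miss when ListBucket is known to be withheld. The sketch below illustrates the idea in Python with stand-in types; it is not the AWS SDK or Cantaloupe code. `S3Error`, `make_fake_get_object` and `read_from_cache` are all hypothetical names, and the fake store simply mimics the documented behaviour (403 instead of 404 for a missing key when listing is not permitted).

```python
from typing import Callable, Dict, Optional


class S3Error(Exception):
    """Hypothetical stand-in for an SDK service exception."""

    def __init__(self, status_code: int, error_code: str) -> None:
        super().__init__(f"{status_code} {error_code}")
        self.status_code = status_code
        self.error_code = error_code


def make_fake_get_object(objects: Dict[str, bytes],
                         can_list: bool) -> Callable[[str], bytes]:
    """Fake GET: a missing key yields 404 with list permission, 403 without."""
    def get_object(key: str) -> bytes:
        if key in objects:
            return objects[key]
        if can_list:
            raise S3Error(404, "NoSuchKey")
        raise S3Error(403, "AccessDenied")
    return get_object


def read_from_cache(get_object: Callable[[str], bytes], key: str,
                    treat_access_denied_as_miss: bool = False) -> Optional[bytes]:
    """Return the cached bytes, or None on a cache miss."""
    try:
        return get_object(key)
    except S3Error as err:
        if err.status_code == 404:
            return None  # ordinary miss
        if treat_access_denied_as_miss and err.status_code == 403:
            return None  # miss reported as 403 because listing is not allowed
        raise
```

The trade-off is that a genuine permissions problem on the bucket would also be silently treated as a miss, which is why the behaviour is opt-in in this sketch.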