Filestore Improvements #108

Open · wants to merge 4 commits into main
Conversation

@mitsuhiko (Member) commented on Jul 13, 2023:

This is a meta RFC to cover some of the potential improvements to our filestore system.

Rendered RFC

@mitsuhiko changed the title from "Filestore new" to "Filestore Improvements" on Jul 13, 2023.
Review thread on text/0108-filestore-new.md (outdated):

```python
# Excerpt from the proposed schema in the RFC.
class FileBlob2(Model):
    organization_id = BoundedBigIntegerField(db_index=True)
```
A member commented:

We also use files outside of organization contexts in control silo. Currently we've cloned the File model relations into control silo models so that we would have similar storage/interfaces for user & sentry app avatars.

Would you want to align file usage in control silo as well?

Another member replied:

User avatars are indeed an interesting problem. Do we have different limits on non-debug-files?
As in: A debug-file can be up to 2G right now, and it is internally chunked.

Can we get away with using a different model for avatars altogether? I would argue they are a lot smaller (limited to 1M maybe?), and it does not make sense to chunk those.
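Purely for illustration, a minimal sketch of what such a separate, non-chunked avatar model could look like; the model name, fields, and the 1 MB cap are assumptions, not something the RFC proposes:

```python
from django.core.exceptions import ValidationError
from django.db import models

# Hypothetical size cap; the 1 MB figure is only an assumption.
AVATAR_MAX_SIZE = 1024 * 1024


class AvatarBlob(models.Model):
    # Small files stored inline: no chunking, no blob index, no deduplication.
    user_id = models.BigIntegerField(db_index=True)
    checksum = models.CharField(max_length=40, unique=True)
    size = models.PositiveIntegerField()
    data = models.BinaryField()

    def clean(self):
        if self.size > AVATAR_MAX_SIZE:
            raise ValidationError("avatar exceeds the size limit")
```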

@Swatinem (Member) commented:

So if I read this correctly, you would still want to chunk files, but not save those chunks deduplicated, thus avoiding the atomic reference counting problem?

How would you manage the migration from the old system to the new one? Will there ever be a cut-off date at which you can just hard-drop the old tables and GCS storage?


As this whole blob-related discussion started with my discovery of a race condition between blob upload and blob deletion: would that be solved by splitting the staging area for uploads, as you suggested, from the long-term storage?

As a reminder, the race condition is actually two separate TOCTOU (Time-of-Check-to-Time-of-Use) problems:

  1. Before even uploading, sentry-cli asks the backend server which chunks are missing based on the chunk-hash.
    Between this check and the final file assembly, the blob can be deleted, failing the assemble.
  2. When assembling the final File, it first queries all the blobs based on their chunk-hash.
    Between this check and actually inserting a reference into the BlobIndex table, the blob can be deleted, failing the assemble.

I believe the first problem can be solved by a dedicated per-org staging area, one that refreshes a chunk's TTL whenever it is queried by chunk-hash.
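A rough sketch of that first fix, assuming a hypothetical per-org StagingBlob table and an arbitrary 24-hour TTL (both assumptions): the missing-chunks check refreshes the TTL of every chunk it confirms, so a staged chunk cannot expire between that check and the final assemble.

```python
from datetime import timedelta

from django.db import models
from django.utils import timezone

# Hypothetical staging-area model; the name, fields, and TTL are assumptions.
STAGING_TTL = timedelta(hours=24)


class StagingBlob(models.Model):
    organization_id = models.BigIntegerField(db_index=True)
    checksum = models.CharField(max_length=40)
    expires_at = models.DateTimeField(db_index=True)

    class Meta:
        unique_together = (("organization_id", "checksum"),)


def find_missing_chunks(organization_id, checksums):
    # Return the chunk hashes the client still has to upload, and refresh the
    # TTL of every chunk that is already staged (time of check).
    now = timezone.now()
    staged = set(
        StagingBlob.objects.filter(
            organization_id=organization_id,
            checksum__in=checksums,
            expires_at__gt=now,
        ).values_list("checksum", flat=True)
    )
    StagingBlob.objects.filter(
        organization_id=organization_id, checksum__in=staged
    ).update(expires_at=now + STAGING_TTL)
    return [c for c in checksums if c not in staged]
```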

The second problem can be solved either by not storing blobs deduplicated, as suggested, or, I believe, by an epoch-based reclamation scheme that still keeps deduplication (a sketch follows the list below):

- Deletion would schedule a chunk to be deleted for epoch N.
- When assembling, we can use an UPSERT to increment the epoch in the database (time of check).
- In between, the deletion job would try to delete records with a matching epoch N, but would not do anything, as the epoch was already bumped to N+1.
- In the next assemble step, when creating BlobIndex entries, the blob is still there, yay (time of use).
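A Django-flavored sketch of that scheme, assuming FileBlob2 (from the snippet above) gains `checksum` and integer `delete_epoch` columns; all of these names are assumptions, and `get_or_create` plus a conditional update merely stands in for a real single-statement UPSERT:

```python
from django.db import transaction
from django.db.models import F

# Assumes FileBlob2 from the snippet above, extended with `checksum` and
# `delete_epoch` fields; everything here is illustrative, not the RFC's API.


@transaction.atomic
def touch_blob_for_assemble(organization_id, checksum):
    # Time of check: create the blob row if it is missing, otherwise bump its
    # epoch so a concurrently scheduled deletion (bound to the old epoch)
    # becomes a no-op.
    blob, created = FileBlob2.objects.get_or_create(
        organization_id=organization_id,
        checksum=checksum,
        defaults={"delete_epoch": 0},
    )
    if not created:
        FileBlob2.objects.filter(id=blob.id).update(
            delete_epoch=F("delete_epoch") + 1
        )
    return blob


def run_scheduled_deletion(blob_id, scheduled_epoch):
    # The deletion job removes the row only if the epoch still matches the
    # value recorded when the deletion was scheduled; a bumped epoch means
    # the blob is referenced again and must be kept.
    deleted, _ = FileBlob2.objects.filter(
        id=blob_id, delete_epoch=scheduled_epoch
    ).delete()
    return bool(deleted)
```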

Not sure if that complexity would be worth it, or whether we can just store duplicated blobs.

If we do not have concurrent writes and deletes, deletions would be trivial and could also be done correctly for older files and blobs.

@mitsuhiko (Member, Author) replied:

> So if I read this correctly, you would still want to chunk files, but not save those chunks deduplicated, thus avoiding the atomic reference counting problem?

I don't know. I think I would allow chunking as part of the system but I would force the chunk to be associated with the offset. But honestly for most of the stuff we probably want to do, one huge chunk for the entirety of the file is probably preferable in practical terms.
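To illustrate that option, a small sketch of a blob index in which every chunk row is bound to its owning file and offset, so chunks are never shared across files and there is no reference counting; all names here are made up:

```python
from django.db import models


class FileBlobIndex2(models.Model):
    # Hypothetical index: each chunk belongs to exactly one (file, offset)
    # pair, so there is no cross-file deduplication to reference-count.
    file_id = models.BigIntegerField(db_index=True)
    offset = models.BigIntegerField()
    blob_id = models.BigIntegerField()

    class Meta:
        unique_together = (("file_id", "offset"),)
```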

@Swatinem (Member) replied:

Can we get a histogram of chunk reuse reasonably? I would love to have some real data on what that reuse looks like.
Maybe the "empty chunk" is shared a ton.
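One rough way to get that data, assuming the existing FileBlobIndex table and that this runs as an offline or batched query rather than in a request path (the import path is an assumption):

```python
from collections import Counter

from django.db.models import Count

from sentry.models import FileBlobIndex  # assumed import path

# Count how many index rows reference each blob, then bucket those counts
# into a histogram of chunk reuse.
reuse_counts = (
    FileBlobIndex.objects.values("blob_id")
    .annotate(refs=Count("id"))
    .values_list("refs", flat=True)
)
histogram = Counter(reuse_counts)  # {reference_count: number_of_blobs}
for refs, blobs in sorted(histogram.items()):
    print(f"{blobs} blobs are referenced {refs} time(s)")
```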
