Disclosure: I work on Google Cloud.

Cool work! I love seeing people pushing distributed storage.

IIUC though, you make a similar choice to Avere and others. You're treating the object store as a distributed block store [1]:

> In HopsFS-S3, we added configuration parameters to allow users to provide their Amazon S3 bucket to be used as the block data store. Similar to HopsFS, HopsFSS3 stores the small files, < 128 KB, associated with the file system’s metadata. For large files, > 128 KB, HopsFS-S3 will store the files in the user-provided bucket.

...

> HopsFSS3 implements variable-sized block storage to allow for any new appends to a file to be treated as new objects rather than overwriting existing objects

It's somewhat unclear to me, but I think the combination of these statements means "S3 is always treated as a block store, but sometimes the File == Variably-Sized-Block == Object." Is that right?

Using S3 / GCS / any object store as a block store with a different frontend is a fine choice for a dedicated client or for applications like HDFS-based ones. But it does mean you throw away interop with other services. For example, if your HDFS-speaking data pipeline produces a bunch of output and you want to read it via some tool that only speaks S3 (like something in SageMaker or whatever), you're kind of trapped.
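
To make the interop point concrete, here's a rough sketch (the bucket name and key layout are made up for illustration, not HopsFS-S3's actual scheme): when the bucket is used as a block store, a plain S3 client only sees opaque block objects, and the mapping back to file paths lives solely in the filesystem's metadata.

    import boto3

    s3 = boto3.client("s3")

    # What an S3-only consumer would like to do: read the pipeline's output by path.
    # s3.get_object(Bucket="my-data-lake", Key="pipeline/output/part-00000.parquet")

    # What it actually sees when the bucket is a block store: opaque block objects
    # whose mapping to file paths exists only in the filesystem's metadata layer.
    objects = s3.list_objects_v2(Bucket="my-data-lake", Prefix="blocks/")
    for obj in objects.get("Contents", []):
        print(obj["Key"])  # e.g. blocks/inode-4711/block-0, meaningless on its own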

It sounds like you're already prepared to support variably-sized chunks / blocks, so I'd encourage you to have a "transparent mode". So many users love things like s3fs, gcsfuse and so on, because even if they're slow, they preserve interop. That's why we haven't gone the "blocks" route in the GCS Connector for Hadoop: interop is too valuable.

P.S. I'd love to see which things get easier for you if you are also able to use GCS directly (or at least know you're relying on our stronger semantics). A while back we finally ripped out all the consistency cache stuff in the Hadoop Connector once we'd rolled out the Megastore => Spanner migration [2]. Being able to use Dual-Region buckets that are metadata consistent while actively running Hadoop workloads in two regions is kind of awesome.

[1] https://content.logicalclocks.com/hubfs/HopsFS-S3%20Extendin...

[2] https://cloud.google.com/blog/products/gcp/how-google-cloud-...




> It's somewhat unclear to me, but I think the combination of these statements means "S3 is always treated as a block store, but sometimes the File == Variably-Sized-Block == Object." Is that right?

If the file is "small" (under a configurable size, typically 128 KB), it is stored in the metadata layer, not on S3. Otherwise, if you write the file once in one session (and it is under the 5 TB object size limit in S3), there will be one object in S3 (variable size; blocks in HDFS are fixed size by default). However, if you append to the file, we add a new object (as a block) for the append.
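
A minimal sketch of that write path (the names, key layout, and in-memory metadata stand-in are assumptions for illustration, not HopsFS-S3 internals): small files go to the metadata layer, a large file written in one session becomes one variable-sized object, and each append adds a new block object instead of overwriting.

    import boto3

    SMALL_FILE_THRESHOLD = 128 * 1024  # 128 KB, configurable

    s3 = boto3.client("s3")
    metadata_store = {}  # stand-in for the database-backed metadata layer

    def write_file(bucket, inode_id, data):
        if len(data) < SMALL_FILE_THRESHOLD:
            # Small files are stored with the file system's metadata, not on S3.
            metadata_store[inode_id] = {"inline_data": data, "blocks": []}
        else:
            # One variable-sized block == one object for a single write session.
            key = f"blocks/{inode_id}/block-0"
            s3.put_object(Bucket=bucket, Key=key, Body=data)
            metadata_store[inode_id] = {"inline_data": None, "blocks": [key]}

    def append_file(bucket, inode_id, data):
        # Appends become new objects rather than overwrites of existing objects.
        entry = metadata_store[inode_id]
        key = f"blocks/{inode_id}/block-{len(entry['blocks'])}"
        s3.put_object(Bucket=bucket, Key=key, Body=data)
        entry["blocks"].append(key)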

We have a new version under development (working prototype) where we can rewrite, in the background, all the blocks of a single file as a single object and make that object readable via the S3 API. It will be released sometime next year. The idea is that you can mark directories as "S3 compatible" and only pay for rebalancing those as needed. You then have the choice of doing the rebalancing on demand or as a background task, prioritizing it, and so on. You know the tradeoffs. Yes, it would be easier to do this with GCS, but we did AWS and Azure first, as we feel GCS is more hostile to third-party vendors. The talks we have given at Google (to the Colossus team a couple of years ago, and to Google Cloud/AI - https://www.meetup.com/SF-Big-Analytics/discussions/57666504... ) are like black holes of information transfer.
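
For the curious, that kind of background rewrite can be done server-side in S3 with a multipart upload that copies each block as a part, roughly like this (function and key names are assumptions, not our actual implementation; note that UploadPartCopy requires every part except the last to be at least 5 MB):

    import boto3

    s3 = boto3.client("s3")

    def compact_to_single_object(bucket, block_keys, target_key):
        # Concatenate the file's block objects into one object readable via the S3 API.
        upload = s3.create_multipart_upload(Bucket=bucket, Key=target_key)
        parts = []
        for i, block_key in enumerate(block_keys, start=1):
            resp = s3.upload_part_copy(
                Bucket=bucket,
                Key=target_key,
                UploadId=upload["UploadId"],
                PartNumber=i,
                CopySource={"Bucket": bucket, "Key": block_key},
            )
            parts.append({"PartNumber": i, "ETag": resp["CopyPartResult"]["ETag"]})
        s3.complete_multipart_upload(
            Bucket=bucket,
            Key=target_key,
            UploadId=upload["UploadId"],
            MultipartUpload={"Parts": parts},
        )
        # The metadata layer would then point reads at target_key and the old
        # block objects could be garbage-collected (omitted here).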


Your upcoming flexibility sounds awesome. I assume many people would just mark the entire bucket as "compatible" to support arbitrary renames/mv of directories, but being able to say "keep this directory in compat mode" will be nice for people who use a single mega bucket and split it into teams / datasets at the top level.

I’m sorry if you’ve tried to talk to us and we’ve been unhelpful. I’d be happy to put you in touch with some GCS people specifically — the Colossus folks are multiple layers below, while the AI folks are multiple layers above. They were probably mostly not sure what to say!

We worked quite openly and frankly with the Twitter folks on our GCS connector [1]. I'd be happy to support doing the same with you. My contact info is in my profile.

(Though I’d definitely agree that we’ve also been surprisingly reticent to talk about Colossus, until recently the only public talk was some slides at FAST).

[1] https://cloud.google.com/blog/products/data-analytics/new-re...


> interop is too valuable

Good point. JuiceFS already provides this "transparent mode"; there it's called compatible mode.



