
Expose queried block IDs from the BucketStore #2479

Closed
pracucci opened this issue Apr 20, 2020 · 13 comments · Fixed by #2516

@pracucci
Contributor

pracucci commented Apr 20, 2020

In Cortex we recently introduced the store-gateway service (proposal) which, similarly to Thanos, sits in front of the bucket and is used to shard blocks across a pool of store-gateway instances (we also support a replication factor for HA in the read path).

The next step on the Cortex side would be introducing a consistency check. Basically, due to the lack of strong coordination, there's no guarantee that the set of store-gateway instances picked by a querier has effectively loaded the exact set of blocks the querier expects (i.e. a ring hash change may not be propagated yet, blocks may still be loading after a resharding, etc.). For this reason, we would like the querier to double-check which blocks were queried by the store-gateway, which is a piece of information missing right now.

Since the store-gateway internally uses the BucketStore, we would need to expose the list of queried block IDs from the Series() API. So far we've been importing the Thanos protobuf in Cortex to guarantee that the Series() API endpoint exposed by the Cortex store-gateway is compatible with Thanos, but that's a soft requirement (even if personally desirable).

I personally see a couple of options:

  1. Add info to storepb.SeriesResponse (in addition to series and warnings). The info would contain an entry with the list of queried block IDs. The info response wouldn't be included by default, but enabled through a new boolean flag in SeriesRequest. This option would allow Cortex to keep full API compatibility with Thanos. One downside is that only BucketStore would support it (OK for Cortex, maybe KO for Thanos).
  2. Refactor BucketStore.Series() into BucketStore.SeriesWithInfo(req, srv) (SeriesInfo, error) (not exposed through the protobuf) and have BucketStore.Series() just call SeriesWithInfo(), ignoring the returned info. This way the Cortex store-gateway could directly call BucketStore.SeriesWithInfo() instead of BucketStore.Series() (see the sketch after this list).
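
To make option (2) a bit more concrete, here is a rough shape sketch: the SeriesInfo type is hypothetical, the SeriesWithInfo() body is elided, and only the signatures follow the proposal above (imports assumed: github.com/oklog/ulid and github.com/thanos-io/thanos/pkg/store/storepb).

// SeriesInfo would carry per-request details that are not exposed over the StoreAPI.
type SeriesInfo struct {
	QueriedBlocks []ulid.ULID // IDs of the blocks touched while serving the request.
}

// SeriesWithInfo would run today's Series() logic and additionally record every
// queried block ID. Body elided; this only shows the intended shape.
func (s *BucketStore) SeriesWithInfo(req *storepb.SeriesRequest, srv storepb.Store_SeriesServer) (SeriesInfo, error) {
	// ... existing Series() implementation, plus the block ID bookkeeping ...
	return SeriesInfo{}, nil
}

// Series keeps the StoreAPI contract unchanged and simply drops the extra info,
// so external gRPC clients see no difference.
func (s *BucketStore) Series(req *storepb.SeriesRequest, srv storepb.Store_SeriesServer) error {
	_, err := s.SeriesWithInfo(req, srv)
	return err
}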

I've also considered using Info(), but there are a couple of downsides:

  1. We would like the list of block IDs queried for a specific Series() request and not the entire list of block IDs loaded in the BucketStore
  2. There would be no guarantee that the returned block IDs are the ones loaded when Series() is called

Thoughts?

@bwplotka
Member

bwplotka commented Apr 21, 2020

Thanks for this, I am curious what others think ( @GiedriusS @brancz ?), but IMO:

First of all, I totally see the need. In fact, Thanos itself can leverage this for block-aware query planning. However, the fact that we plan for the query planner to be in the query-frontend is not helping. (: We might need dynamic sharding like you have as well, so it definitely sounds useful, just not in the near future, and we don't yet know how it might work for us either.

Secondly, we definitely don't want to block you; it would be silly if you had to fork the whole bucket.go work we did together.

Unfortunately, I think it would be better if we could avoid any of those options, and avoid even putting this information in Info itself. The reason is that the StoreAPI is really implementation agnostic (as all interfaces should be, kind of). The StoreAPI should not care how you store metrics: blocks, buckets, files, bytes in memory, tapes, jars, human memory... it does not matter what the underlying structure is. The StoreAPI allows fetching Series, period. It allows an API to expose what labels it has and what time period it has data for. I think we should be super focused on maintaining the narrowness of our interface, our API, at all costs. This is why we plan to create separate APIs for Rules and Targets. This is what allows implementing the StoreAPI for OpenTSDB, InfluxDB or any other backend. The block details should not leak through this abstraction.

How to solve it?

I think there are some ways we can actually enable this, i.e. allow the Thanos store gateway to give some hints. Since those hints are related to consistency, we could name this thing consistency_hints. We can easily pass those hints between two components without leaking the abstraction.

There is even an official way of doing so, defined here: https://developers.google.com/protocol-buffers/docs/proto3#unknowns

So:

message SeriesResponse {
  oneof result {
    Series series = 1;

    /// warning is considered an information piece in place of series for warning purposes.
    /// It is used to warn query customer about suspicious cases or partial response (if enabled).
    string warning = 2;
    
    // comment....
    google.protobuf.Any consistency_hints = 3;
  }
}

(Not sure if it should be part of the oneof, this is just a proposal. SeriesResponse is a frame, so it's quite weird for a hint to be on all frames.)

With this, we can maintain a separate block-aware protobuf type that we can version between the Store GW and a block-aware Querier 🤗 This way, if we switch to jars with memory sticks (TSDBv2!) someday, the Jar Gateway can pass hints about jar types 😄 Does it make sense? 🤔
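
To illustrate the mechanics, here is a minimal, self-contained sketch of the Any round trip using golang/protobuf's ptypes helpers; wrappers.StringValue only stands in for the (yet to be defined) block-aware hints message so that the example compiles on its own:

package main

import (
	"fmt"

	"github.com/golang/protobuf/ptypes"
	"github.com/golang/protobuf/ptypes/wrappers"
)

func main() {
	// Store gateway side: pack the hint payload into a google.protobuf.Any.
	hint := &wrappers.StringValue{Value: "stand-in for a block-aware hints message"}
	packed, err := ptypes.MarshalAny(hint)
	if err != nil {
		panic(err)
	}

	// Block-aware querier side: check the type URL before decoding, so a client
	// that does not recognise the payload can simply ignore it.
	out := &wrappers.StringValue{}
	if ptypes.Is(packed, out) {
		if err := ptypes.UnmarshalAny(packed, out); err != nil {
			panic(err)
		}
		fmt.Println("hint payload:", out.Value)
	}
}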

cc @brancz @pracucci (and also @s-urbaniak as we touched APIs a lot recently).

The alternative would be to use gRPC metadata. It feels like a nice place for hints, but I don't fully like it because we lose the typed and versioned nature of protobuf, so I feel the Any extension would be better here. (:

@s-urbaniak
Contributor

s-urbaniak commented Apr 21, 2020

I really like the idea of having an opaque google.protobuf.Any consistency_hints = 3; entry 👍 It reminds me of json.RawMessage in Go, which turned out to be a powerful tool for supporting and dispatching different APIs.

Effectively this would introduce an opaque field, left to be agreed between a concrete implementer and consumer. If dispatching based on the actual underlying type is easy, then this is a great way forward.

Just to raise some concerns too: I have seen APIs in the past (mostly legacy systems) that included opaque fields, and saw two anti-patterns:

  • Abuse of opaque fields for things they were not meant for, e.g. transporting metadata that is actually not a "consistency hint".
  • Bloating APIs with opaque fields when there is a lack of consensus.

Recommendation from my side: have a crisp and clear definition of the context of a "consistency hint", preferably with an example and (maybe even more important here?) a sane default data structure that fits the 80th-percentile use case and that implementers/consumers can follow.
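
For what it's worth, a possible shape for that 80% default in the block-store case, sketched as a plain Go struct (the real thing would presumably be a versioned protobuf message; the names are illustrative only):

// SeriesConsistencyHint is a hypothetical default payload for a block-based store:
// just the IDs of the blocks that were actually queried for this request, so the
// client can compare them with the set it expected the store gateway to have loaded.
type SeriesConsistencyHint struct {
	QueriedBlockIDs []string // block ULIDs touched by this Series() call
}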

@bwplotka
Member

Linking the discussion we had around possible options: https://cloud-native.slack.com/archives/CL25937SP/p1587453842375700. It would be nice to sum up that discussion here somehow.

@pracucci
Contributor Author

We thought about 3 options:

  1. Add consistency_hints to SeriesResponse inside the oneof
  2. Add consistency_hints to SeriesResponse as a sibling of the oneof
  3. gRPC metadata

Given the options above, what I meant before is that I don't see much value in (2) over (1).

1. Add consistency_hints to SeriesResponse inside oneof

message SeriesResponse {
  oneof result {
    Series series = 1;

    /// warning is considered an information piece in place of series for warning purposes.
    /// It is used to warn query customer about suspicious cases or partial response (if enabled).
    string warning = 2;

    /// consistency_hints is an opaque data structure containing information that could be used
    /// on the client side to run a consistency check about the queried series. The content of
    /// this field and whether it's supported depends on the implementation of a specific store.
    google.protobuf.Any consistency_hints = 3;
  }
}

This change is backward incompatible in the context of Thanos. An old Thanos querier that's querying a new store may receive a SeriesResponse frame containing the consistency_hints and fail because neither series nor warning is set. Both the querier and the proxy suffer from this issue, e.g.:

thanos/pkg/query/querier.go, lines 121 to 132 at commit 822bc7c:

func (s *seriesServer) Send(r *storepb.SeriesResponse) error {
	if r.GetWarning() != "" {
		s.warnings = append(s.warnings, r.GetWarning())
		return nil
	}
	if r.GetSeries() == nil {
		return errors.New("no seriesSet")
	}
	s.seriesSet = append(s.seriesSet, *r.GetSeries())
	return nil
}

A possible workaround would be adding an option to SeriesRequest to enable consistency hints (disabled by default):

message SeriesRequest {
  // [current request fields]

  bool consistency_hints_enabled = 9;
}

This would also have the advantage of skipping any consistency hints generation on the store side when not requested.
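
A sketch of that gate: the ConsistencyHintsEnabled field name is the Go name that would be generated for the proposed flag (it does not exist yet), and the helper below is purely illustrative (imports assumed: github.com/oklog/ulid and the storepb package with the new field generated).

// buildQueriedBlockIDs returns nil without doing any work when the client did
// not opt in via the proposed request flag; otherwise it renders the IDs of the
// blocks that were queried while serving the request.
func buildQueriedBlockIDs(req *storepb.SeriesRequest, queriedBlocks []ulid.ULID) []string {
	if !req.ConsistencyHintsEnabled {
		return nil
	}
	ids := make([]string, 0, len(queriedBlocks))
	for _, b := range queriedBlocks {
		ids = append(ids, b.String())
	}
	return ids
}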

2. Add consistency_hints to SeriesResponse as a sibling of oneof

message SeriesResponse {
  oneof result {
    Series series = 1;

    /// warning is considered an information piece in place of series for warning purposes.
    /// It is used to warn query customer about suspicious cases or partial response (if enabled).
    string warning = 2;
  }

  /// consistency_hints is an opaque data structure containing information that could be used
  /// on the client side to run a consistency check about the queried series. The content of
  /// this field and whether it's supported depends on the implementation of a specific store.
  google.protobuf.Any consistency_hints = 3;
}

This change is backward compatible, but it opens a design issue: given that SeriesResponse is a frame and the consistency_hints are not really tied to a frame, which SeriesResponse should contain the consistency_hints? If it's "all responses", then we're wasting resources. If it's "just one response", then which one? A random one?

3. gRPC metadata

Another option would be using gRPC metadata. This would allow us to add hints to the response without bloating the protobuf. We would lose data types and versioning if we use a custom struct, but we could define a protobuf message for BucketStore consistency hints and pass serialized protobuf in the gRPC metadata.

This opens another question: should every Series() response contain the consistency hints, or should we allow the client to enable them via a flag? In the latter case, we would end up building a request/response pattern on top of metadata (the request metadata contains a flag saying whether consistency hints are enabled, the response metadata contains the hints), which sounds weird. In that case, option (1) may be better.
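
For reference, a minimal sketch of how the metadata variant could look, assuming a hypothetical consistency-hints-bin trailer key (gRPC requires the -bin suffix for binary metadata values); this is not an agreed design, just an illustration:

package example

import (
	"io"

	"google.golang.org/grpc/metadata"

	"github.com/thanos-io/thanos/pkg/store/storepb"
)

// Server side: after streaming all frames, attach the serialized hints message
// to the trailer of the Series() stream.
func attachHints(srv storepb.Store_SeriesServer, serializedHints []byte) {
	srv.SetTrailer(metadata.Pairs("consistency-hints-bin", string(serializedHints)))
}

// Client side: trailers only become available once the stream has been fully
// consumed (a real client would of course also keep the received series frames).
func readHints(cl storepb.Store_SeriesClient) ([]byte, error) {
	for {
		if _, err := cl.Recv(); err != nil {
			if err == io.EOF {
				break
			}
			return nil, err
		}
	}
	if vals := cl.Trailer().Get("consistency-hints-bin"); len(vals) > 0 {
		return []byte(vals[0]), nil
	}
	return nil, nil
}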

@bwplotka
Member

bwplotka commented Apr 22, 2020

  1. Add consistency_hints to SeriesResponse inside oneof

I vote for this one.

However, I think it would be nice to clarify the situation here. This change itself (hints in the oneof) is NOT a breaking change from the gRPC perspective. The problem is that we don't handle it properly in Thanos clients. This is well explained here, but I think it's a good candidate for a weekend blog post (:

So yes, for queriers it will be breaking, but only because we have a bug that we have to fix anyway.

The following code should look like this:

func (s *seriesServer) Send(r *storepb.SeriesResponse) error {
	if r.GetWarning() != "" {
		s.warnings = append(s.warnings, r.GetWarning())
		return nil
	}

	if r.GetSeries() != nil {
		s.seriesSet = append(s.seriesSet, *r.GetSeries())
		return nil
	}

	// Unsupported field, skip.
	return nil
}

So IMO we should fix this bug and put the hints inside the oneof.

2. Add consistency_hints to SeriesResponse as a sibling of oneof

Fully agree, super confusing.

3. gRPC metadata

Agree, it brings lots of surprises vs. a typed, proto-based approach.

However, the BIG advantage is that with this approach the unclear behavior that B below explains is SOLVED (:

To sum up

Agree with you and vote for option 1, now there are more questions:

A. The name of the field and topic: consistency_hints... does it mean we have to have a separate frame for other kinds of hints? 🤔 I know I proposed this, but I feel like just hints might be more extensible... Then in some other proto we can version message hint { ConsistencyHints consistency = 1; }, etc.

B. How should we proceed if we see multiple hints in one response (many frames with different hints ;p)?

@pracucci
Contributor Author

Agree with you and vote for option 1

👍

A. The name of the field and topic: consistency_hints... does it mean we have to have a separate frame for other kinds of hints?

I agree on calling it hints and storing any hint.

I would suggest defining another protobuf message for the BucketStore hints (in a separate .proto file and package) so that we can easily leverage versioning. Thoughts?

How should we proceed if we see multiple hints in one response

I would discourage supporting this. I know we can't formally enforce it, but we could document that a Series() response should not contain more than one frame with hints, and that if there are 2+ the first one is kept (the others are ignored). It wouldn't be enforced by protobuf, just by the implementation. Makes sense?

@bwplotka
Member

I would suggest defining another protobuf message for the BucketStore hints (in a separate .proto file and package) so that we can easily leverage versioning.

SGTM

I would discourage supporting this. I know we can't formally enforce it, but we could document that a Series() response should not contain more than one frame with hints, and that if there are 2+ the first one is kept (the others are ignored). It wouldn't be enforced by protobuf, just by the implementation. Makes sense?

I mean that's our only choice I guess (:

@bwplotka
Member

What if the block layout changes during a long response? Maybe we should take the last one, just in case? ;p

@pstibrany
Contributor

I think the possibility of sending different hints in multiple messages is something we should not discourage or treat as an error. The client should simply ignore any hint it doesn't understand. If there are multiple hints, the client will use all the hints it can.

@pstibrany
Contributor

(Also, I vote for 1; it looks like the best option to me.)

@pracucci
Contributor Author

What if the block layout changes during a long response? Maybe we should take the last one, just in case? ;p

In this scenario, taking the last one looks safer.

I think the possibility of sending different hints in multiple messages is something we should not discourage or treat as an error.

It really depends on how you structure the hint. Since it's an opaque structure, I agree it's difficult to state this in advance in the SeriesResponse contract. Maybe we should just say that it's implementation specific and leave the final decision (whether multiple hints are allowed or not) to the specific implementation?

@pstibrany
Contributor

It really depends on how you structure the hint. Since it's an opaque structure, I agree it's difficult to state this in advance in the SeriesResponse contract. Maybe we should just say that it's implementation specific and leave the final decision (whether multiple hints are allowed or not) to the specific implementation?

I would like to avoid the same trap. If we're too strict and don't allow multiple hints now, we may find that we need them in the future but clients cannot handle them. I would explicitly allow multiple hints, and how exactly they are handled is specific to the hint. (Basically saying the same as you.)

@bwplotka
Member

bwplotka commented Apr 22, 2020 via email
