Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

De-conflate local from remote service names #2484

Merged
merged 25 commits into from
Apr 14, 2019

Conversation

codefromthecrypt
Copy link
Member

@codefromthecrypt codefromthecrypt commented Apr 3, 2019

Especially mentioned in #1794, local and remote service names are undeniably serving different use cases, yet currently conflated in the api.

/services -> only local serviceNames
/remoteServices?serviceName=X -> new: only remote serviceNames for auto-complete
/spans?remoteServiceName=X -> new: to restore functionality @llinder mentioned

cc @openzipkin/ui @bulicekj

@codefromthecrypt
Copy link
Member Author

oh wow.. cassandra tests pass now :) someone quick take a photo!

return new Metadata(compactionClass);
return new Metadata(
compactionClass,
hasUpgrade1_autocompleteTags(keyspaceMetadata),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zeagord you and I both forgot to check this!

@@ -77,12 +90,19 @@
try (CassandraStorage storage = CassandraStorage.newBuilder()
.contactPoints(contactPoint.getHostString() + ":" + contactPoint.getPort())
.ensureSchema(false)
.autocompleteKeys(asList("environment"))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zeagord here was the secret to causing us to figure out if we really work when schema isn't updated

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:) I totally forgot about this.

@codefromthecrypt
Copy link
Member Author

I am nearly done the hard parts. just have to implement query support for cassandra and cassandra v1. after that, it is easy

@codefromthecrypt
Copy link
Member Author

codefromthecrypt commented Apr 6, 2019

@openzipkin/cassandra I'm a little stuck on the best way to model the remoteServiceName=X query.

I'm thinking about adding this for the remote service index. only other option I can think of is adding a SASI field on spans of r_service. thoughts?

CREATE TABLE IF NOT EXISTS zipkin2.trace_by_service_remote_service (
    service       text,             //-- service name, or blank to support queries without service name
    remote_service  text,             //-- remote service name
    bucket        int,              //-- time bucket, calculated as ts/interval (in microseconds), for some pre-configured interval like 1 day.
    ts            timeuuid,         //-- start timestamp of the span, truncated to millisecond precision
    trace_id      text,             //-- trace ID
    PRIMARY KEY ((service, remote_service, bucket), ts)
)
   WITH CLUSTERING ORDER BY (ts DESC)
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy'}
    AND default_time_to_live =  259200
    AND gc_grace_seconds = 3600
    AND read_repair_chance = 0
    AND dclocal_read_repair_chance = 0
    AND speculative_retry = '95percentile'
    AND comment = 'Secondary table for looking up a trace by remote service name. service column may be blank (when only looking up by remote_service). bucket column adds time bucketing to the partition key, values are microseconds rounded to a pre-configured interval (typically one day). ts column is start timestamp of the span as time-uuid, truncated to millisecond precision.';

Note: this would be in addition to trace_by_service_span, and require a client-side join if other parameters are sent (similar to how we do with annotation query).

Another option is to extend trace_by_service_span multiplying rows there by remote service name. This would be simpler to query... Not sure what's better? throwing this out to get feedback.


CREATE CUSTOM INDEX IF NOT EXISTS ON zipkin2.trace_by_service_span (duration) USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {'mode': 'PREFIX'};
CREATE TABLE IF NOT EXISTS zipkin2.remote_service_by_service (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is adding the tables trace_by_service_remote_service and remote_service_by_service the only actual change in the file's diff?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the other changes were moving the indexing near the tables they indexed as it was confusing me (at the cost of confusing you in this review.. sorry)

@codefromthecrypt
Copy link
Member Author

cassandra 3.11.3+ done (except some unit tests which all around need to be added but saving for later)
next up cassandra 2.2+ impl (cassandra-v1)

codefromthecrypt pushed a commit that referenced this pull request Apr 10, 2019
In #2484, we will stop indexing the remote service conflated with the
local service. This refactors the index parsing logic to allow that
change to be visible. This also corrects for some drift noticed while
refactoring.
@codefromthecrypt
Copy link
Member Author

will merge this prior to the cassandra v1 impl as it was too hard to see what changed #2492

codefromthecrypt pushed a commit that referenced this pull request Apr 10, 2019
…2492)

In #2484, we will stop indexing the remote service conflated with the
local service. This refactors the index parsing logic to allow that
change to be visible. This also corrects for some drift noticed while
refactoring.
@codefromthecrypt
Copy link
Member Author

ok cassandra v1 "done" (same quality caveat)

now I will do a quality sweep including unit tests and README updates. if the build is green, it is testable

@codefromthecrypt
Copy link
Member Author

only task left is unit test sweep.. doing that now.

@codefromthecrypt codefromthecrypt merged commit b864614 into master Apr 14, 2019
@codefromthecrypt codefromthecrypt deleted the remote_service_name branch April 14, 2019 08:16
abesto pushed a commit to abesto/zipkin that referenced this pull request Sep 10, 2019
…penzipkin#2492)

In openzipkin#2484, we will stop indexing the remote service conflated with the
local service. This refactors the index parsing logic to allow that
change to be visible. This also corrects for some drift noticed while
refactoring.
abesto pushed a commit to abesto/zipkin that referenced this pull request Sep 10, 2019
Especially mentioned in openzipkin#1794, local and remote service names are undeniably serving different use cases, yet currently conflated in the api.

/services -> only local serviceNames
/remoteServices?serviceName=X -> new: only remote serviceNames for auto-complete
/spans?remoteServiceName=X -> new: to restore functionality @llinder mentioned
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants