Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce model translation and encoding interfaces #3200

Merged
merged 16 commits into from
May 20, 2021

Conversation

jrcamp
Copy link
Contributor

@jrcamp jrcamp commented May 17, 2021

This is a somewhat more limited version of #3044. It decouples
translation of models from encodings.

Models refers to the in-memory representation of a protcol like the zipkin v2
SpanModel. After translating from pdata to the model an encoding is used to
serialize the model to a particular byte representation (protobuf, JSON, etc.).
The reverse also applies in deserializing an encoding of bytes to a model then
translating the model to pdata.

The goal is to be able to have a generic way of translating and encoding to/from pdata from supported types (zipkin, Jaeger, etc.) without having to have explicit knowledge about all those types. For instance kafkareceiver/kafkaexporter can then be made to support any type that adheres to the interfaces without having to have knowledge about any of them specifically.

Open questions

* Currently encoding/decoding are separate interfaces which means you don't have to implement both. Model translation is currently a single interface which means you have to implement both. Should requiring two-way conversion be mandated by the interfaces like in translation or should you be able to implement only one-way conversions like the encode/decode interfaces are?

They were made separate interfaces.

This is a somewhat more limited version of open-telemetry#3044. It decouples
translation of models from encodings.

Models refers to the in-memory representation of a protcol like the zipkin v2
SpanModel. After translating from pdata to the model an encoding is used to
serialize the model to a particular byte representation (protobuf, JSON, etc.).
The reverse also applies in deserializing an encoding of bytes to a model then
translating the model to pdata.
@project-bot project-bot bot added this to In progress in Collector May 17, 2021
protocols/encodings/decoder.go Outdated Show resolved Hide resolved
protocols/encodings/encodings.go Outdated Show resolved Hide resolved
protocols/encodings/encodings.go Outdated Show resolved Hide resolved
protocols/encodings/encoder.go Outdated Show resolved Hide resolved
protocols/encodings/encodings.go Outdated Show resolved Hide resolved
protocols/encodings/encoder.go Outdated Show resolved Hide resolved
protocols/models/models.go Outdated Show resolved Hide resolved
protocols/models/models.go Outdated Show resolved Hide resolved
protocols/models/models.go Outdated Show resolved Hide resolved
protocols/encoding/encoding.go Outdated Show resolved Hide resolved
protocols/encoding/decoder.go Outdated Show resolved Hide resolved
Before the encoder interfaces took pdata and serialized it by calling the
translator. This fully separates the concerns by having model do pure
translation, encoding do pure serialization, and the transcoder doing both.

Without this there was no way to use the encoder if you already had a model.

Only updates traces for feedback purposes.
_ encoding.TracesDecoder = (*Encoder)(nil)
)

type Encoder struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

map[encoding.Type]Encoder

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's possible but will require some helpers. Let's punt on it for now though as it doesn't impact the external API so can be easily tidied up later.

* standardized on encoding/decoding terminology
* renamed encodings to bytes to avoid confusion with encode/decode terminology
* added high level interfaces to top level protocols package that goes directly pdata <-> bytes
protocols/bytes/encoding.go Outdated Show resolved Hide resolved
protocols/README.md Outdated Show resolved Hide resolved
protocols/decoder.go Outdated Show resolved Hide resolved
protocols/decoder.go Outdated Show resolved Hide resolved
protocols/models/models.go Outdated Show resolved Hide resolved
@bogdandrutu
Copy link
Member

@jrcamp later we may want to add a "compressor" interface for things like https://github.com/open-telemetry/opentelemetry-collector/issues/3223. Or maybe we can use a standard library for that.

@jrcamp jrcamp changed the title [wip] Introduce model translation and encodings interfaces Introduce model translation and encoding interfaces May 19, 2021
@jrcamp jrcamp marked this pull request as ready for review May 19, 2021 03:16
@jrcamp jrcamp requested a review from a team as a code owner May 19, 2021 03:16
@jrcamp jrcamp requested a review from bogdandrutu May 19, 2021 18:50
protocols/README.md Outdated Show resolved Hide resolved
protocols/README.md Outdated Show resolved Hide resolved
protocols/README.md Outdated Show resolved Hide resolved
protocols/encoding/decoder.go Outdated Show resolved Hide resolved
protocols/translation/decoder.go Outdated Show resolved Hide resolved
protocols/translation/decoder.go Outdated Show resolved Hide resolved
protocols/translation/encoder.go Outdated Show resolved Hide resolved
Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Top directory should be "model" instead of "protocols", see #3104 for the decision of the package.

model/translator/decoder.go Outdated Show resolved Hide resolved
model/translator/encoder.go Outdated Show resolved Hide resolved
model/translator/decoder.go Outdated Show resolved Hide resolved
@bogdandrutu
Copy link
Member

bogdandrutu commented May 20, 2021

Some feedback based on my quick PR to fix usage of internal protos in fileexporter:

  • Because we accept/return byte[] instead of io.Reader/io.Writer need to do extra allocation and copy of the bytes
  • I feel that "Serializer -> Marshaler" and "Deserializer -> Unmarshaler" makes more sense now :)

See #3238

@jrcamp
Copy link
Contributor Author

jrcamp commented May 20, 2021

Some feedback based on my quick PR to fix usage of internal protos in fileexporter:

  • Because we accept/return byte[] instead of io.Reader/io.Writer need to do extra allocation and copy of the bytes

I looked into the io.{Reader,Writer} earlier and from what I could tell protobuf (binary representation) didn't appear to support them and dealt in []byte. And most of the existing serialization code seemed to return []byte which is why I went that route. (Not saying it's correct, just to explain how I got here).

I think both might have their place and maybe the answer is to support both. My concern about only supporting Reader/Writer would be internally the encoder is dealing in bytes, has to return bytes.NewReader(buf) then the user wants to deal in []byte so they ioutil.ReadAll(reader) which kind of defeats the whole purpose. :)

Perhaps both could be supported though?

// MetricsMarshaler encodes protocol-specific data model into bytes.
type MetricsMarshaler interface {
	MarshalMetrics(model interface{}) (io.Reader, error)
	MarshalMetricsBytes(model interface{}) ([]byte, error)
}

The protocol would implement one then the default implementation of the unimplemented one would call the implemented one and wrap/unwrap bytes as needed. But that way if you call MarshalMetricsBytes and internally it's dealing in bytes and you want bytes you don't have to double allocate. Maybe.

  • I feel that "Serializer -> Marshaler" and "Deserializer -> Unmarshaler" makes more sense now :)

See #3238

Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good as a first step. Can you for the moment put it in the internal/model so we don't expose it publicly until we know the exact path we want?

Collector automation moved this from In progress to Reviewer approved May 20, 2021
@bogdandrutu
Copy link
Member

bogdandrutu commented May 20, 2021

internal/model/translator/translator.go:32:58: unexported-return: exported func NewErrIncompatibleType returns unexported type *translator.errIncompatibleType, which can be annoying to use (revive)
func NewErrIncompatibleType(expected, given interface{}) *errIncompatibleType {
                                                         ^

Just return error :)

@bogdandrutu bogdandrutu merged commit db093a6 into open-telemetry:main May 20, 2021
Collector automation moved this from Reviewer approved to Done May 20, 2021
dashpole pushed a commit to dashpole/opentelemetry-collector that referenced this pull request Jun 14, 2021
)

* [wip] Introduce model translation and encodings interfaces

This is a somewhat more limited version of open-telemetry#3044. It decouples
translation of models from encodings.

Models refers to the in-memory representation of a protcol like the zipkin v2
SpanModel. After translating from pdata to the model an encoding is used to
serialize the model to a particular byte representation (protobuf, JSON, etc.).
The reverse also applies in deserializing an encoding of bytes to a model then
translating the model to pdata.

* use encode/decode consistently

* review feedback

* decouple encoding from model

Before the encoder interfaces took pdata and serialized it by calling the
translator. This fully separates the concerns by having model do pure
translation, encoding do pure serialization, and the transcoder doing both.

Without this there was no way to use the encoder if you already had a model.

Only updates traces for feedback purposes.

* add high level interface

* standardized on encoding/decoding terminology
* renamed encodings to bytes to avoid confusion with encode/decode terminology
* added high level interfaces to top level protocols package that goes directly pdata <-> bytes

* cleanup

* Apply suggestions from code review

Co-authored-by: Tigran Najaryan <[email protected]>

* renamings

* reword error

* cleanup

* return interface instead of out parameter

* review feedback

* serialize -> marshal

* put in internal

* lint

Co-authored-by: Tigran Najaryan <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Collector
  
Done
Development

Successfully merging this pull request may close these issues.

None yet

4 participants