Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Object embedding should be its own middleware #11

Closed
faultyserver opened this issue Jul 27, 2016 · 8 comments
Closed

Object embedding should be its own middleware #11

faultyserver opened this issue Jul 27, 2016 · 8 comments

Comments

@faultyserver
Copy link
Member

While it makes sense to embed objects in Conductor (as it is the reason that embeds need to exist), it makes more sense to me that embedding takes place in a separate middleware, that would go before Conductor in the stack.

My current reasoning is this:

  • Embedding is not a sub-process of publishing related events (the purpose of Conductor). Creating a separate middleware changes the decision from an AND to an OR, which is nice.
  • Other middlewares could potentially require embeds, while others might not want those embeds to exist. Tying the process into Conductor means middlewares before Conductor in the stack will not have embedded objects, while everything after it will have embedded objects. Creating a separate middleware gives more control over this situation.
faultyserver added a commit that referenced this issue Jul 27, 2016
This allows middlewares that are further up the stack to use the modifications that Conductor makes (such as embedding objects).

This necessity of this might change, though. See #11.
@elliottwilliams
Copy link
Member

Agreed. I was reading through Conductor yesterday because I need to check which events triggered associations, and I think it'd fit in better as its own middleware. It'd certainly make it easier to ascertain how associations are made from the code.

@faultyserver
Copy link
Member Author

After writing a bit of the implementation, I think it makes more sense to place this new middleware after Conductor in the stack. That way, any events that Conductor fires will also go through the embedding process, making the events that get published more consistent.

So now, the stack is

Transport
ObjectEmbed
Conductor
Agency (the core system)

faultyserver added a commit that referenced this issue Aug 5, 2016
Includes the definition of the ObjectEmbed class, as well as a configuration file and additions to the citybus configuration to use the middleware.
faultyserver added a commit that referenced this issue Aug 6, 2016
See #11. Having ObjectEmbed come after Conductor ensures that objects are embedded appropriately in all events that come out from the system.

Since Conductor is responsible for publishing new events based on existing ones, if ObjectEmbed were to come before it, then those new events would either not have embedded objects, or Conductor would have to embed them manually (neither of which is a good solution).
@faultyserver
Copy link
Member Author

Having gotten most of the way through the implementation for this, I'm thinking have a more-generic Serialization middleware is more fitting than one dedicated to "embedding objects", which doesn't actually do anything until an object gets serialized to a hash anyway (an Object instance is a plain Object instance until it gets serialized. The embedding we're trying to accomplish doesn't change anything about the Object instance, it just replaces values in the serialized Hash).

So, my proposal is to change ObjectEmbed to Serializer, and have each Shark::Object subclass that wants to embed information implement serialize_for(event, opts={}). It seems like object embeds are mostly event-agnostic, but having the event parameter will enable potential specifications later on.

I like the idea of having the serialization be something that an Object does itself. The to_h implementation started this idea from the very beginning, where ObjectManager#fire was providing object.to_h as the first argument to an event, the rationale being "it has to happen sometime; why not now?". This has since been changed on the object_embed branch to provide Object instances as parameters right up until events get published by Transport. Having the logic internal to an Object also makes it easier for other middlewares to effect changes in the serialization of an Object (by setting flags or other means). I can't think of any uses for this, but it could happen.

On the other hand, I don't like the idea of having serialization handled by Object classes because it's a bundling of view logic into a model. Not that it's necessarily terrible, but the two are rather unrelated. The only actual advantage I can currently see to having it in the Object classes is better obeying the Lawguideline of Demeter, since almost all of the necessary attributes will be directly accessible. The advantages to having serialization logic managed externally include unbundling/isolating the serialization logic, and better configurability (with Shark::Configurable); while underwhelming, these are certainly nice to have.

tl;dr: ObjectEmbed will become Serializer, which will map all of the args/kwargs for an event with Object#serialize_for(event_type) or Serializer#serialize(arg, for: event_type) (decision pending on which). The change is occurring for better semantics in that embedding requires serialization to actually have any effect.

@elliottwilliams
Copy link
Member

That makes sense to me. Since all serialized views from Shark represent a single object, it doesn't feel like you are bundling in any logic that isn't already directly related to the object's properties.

@faultyserver
Copy link
Member Author

My qualm with having it as part of Shark::Object was that how exactly an object gets serialized (which attributes are included, whether associations are followed, etc.) is dependent on its context.

For example, a Route object as a top-level argument will be serialized with all of its attributes and its associated objects. However, when it is embedded inside a Vehicle object, it only includes a few attributes (identifier, name, short_name, color) and does not include its associated objects.

This context dependency seems to mean that serialize_for will either have to take a context parameter of some sort (or an option like simplified: true) or objects that embed other object's attributes will have to do that embedding manually, which is somewhat brittle.

I'm still a little torn on where the implementation should go, but I'm going to try writing it as part of Serializer first, and see how that goes. If it feels messy/awkward, I'll move it into Shark::Object.

faultyserver added a commit that referenced this issue Aug 9, 2016
Per #11, the `ObjectEmbed` middleware is more aptly named `Serializer`, since object embedding has no actual effect until an Object is serialized, meaning the middleware would be serializing objects before doing any embedding.

Classifying the middleware as `Serializer` covers this process semantically, and generalizes the middleware, allowing it to do more than just embed objects (though, it probably won't any time soon).
@faultyserver
Copy link
Member Author

Alright. I'm not entirely convinced that having serialization inside of Shark::Object is best, but it is definitely the easier solution, so I'll be implementing that instead.

Serializer will still provide configuration options for serialization through Serializer.configure, but these will simply be passed on to the instance's to_json method to deal with as it pleases.

The advantage to this approach is simplicity: All native Ruby objects (numerics, strings, arrays, etc.) have a default to_json implementation, and Shark::Object#to_json already takes care most Objects; all that has to be written is specializations for partial embedding routes on vehicles, stations on routes, and other such embeds.

@faultyserver
Copy link
Member Author

Once again, I'm having different thoughts now.

to_json doesn't actually work desirably, here: it returns the JSON string representation of an object rather than the hash. I had confused this with as_json, which would actually do what I want here (convert Objects to hashes but leave simple types as-is).

However, native objects do not implement as_json (nor do they implement to_h), meaning that method can not be consistently used, and we're back to square one with a large case statement to do type checking.

ruby_wamp_client (the WAMP gem I'm using) doesn't like taking string arguments (it expects args/kwargs as an array/hash, and calls to_json on them).


With that, I'm debating whether Serializer is really necessary at all.

While it provides opt-in functionality and a nice, isolated configuration scheme, it really doesn't do much, since the WAMP client expects plain objects and calls to_json on them anyway.

I think I'm going to try adding Shark::Configurable to Shark::Object and having a default configuration that looks something like:

Shark::Object.configure do |config|
  ...
  ### Serialization ###

  # Set to true to embed all objects in a one-to-many association. For example:
  # when true, a Route will embed object information for each Station in its
  # `stations` attribute; when false, the attribute would be kept as a list of
  # identifiers.
  config.embed_has_many_associations = true
  # The number of levels through which object embedding will occur.
  # A value of 0 will perform no embedding on Objects.
  # A value of 1 will embed Objects that are direct attributes of an Object.
  # Values higher than 1 will embed further-nested Object attributes.
  config.embed_depth = 1
end

For special cases, the configuration can be overwritten both at the class level, and the instance level:

# Class-level specialization
Shark::Route.configure do |config|
  config.embed_has_many_associations = false
end

# Instance-level specialization
vehicle = Shark::Vehicle.new
vehicle.configure do |config|
  config.embed_depth = 3
end

These options will all get parsed by the to_json method on Shark::Object by default (or by any subclass implementation) when the WAMP client calls it on each argument for an event.

faultyserver added a commit that referenced this issue Aug 12, 2016
The `ObjectSerialization` module includes all of the logic necessary for serializing objects, including configurability through a number of options (outlined in `config/serialization.rb`).

Somewhat awkwardly, `Object` is now indirectly tied to `Storage`, which means that a given Shark instance can only use one Storage instance. This was already the case, but by choice, not by requirement (though, having different adapters for a given instance would be rather odd. Sharding, for example, should be taken care of by the adapter itself, rather than configuration).
@faultyserver
Copy link
Member Author

Closed by #18.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants