Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement parsing logic for converting text to Expression #1079

Merged
merged 18 commits into from
May 14, 2020
Merged

Conversation

kitchoi
Copy link
Contributor

@kitchoi kitchoi commented May 12, 2020

This PR works towards finishing item 4 in #977

Text parser for parsing traits domain specific language to an instance of Expression

  • Add conversion logic from a syntax tree to an instance of Expression
  • Add parse function to convert text to Expression. This function is exposed in the public API
  • Add trait as a method on Expression for extending an expression for observing a trait with a given name.
  • Add trait top-level function for creating an expression for a given trait name (this just wraps the method on an empty expression).

Note that parsing "+metadata" and "items" currently raise NotImplementedError as the observers required for these two are still being reviewed. This PR therefore adds some placeholders for them.

Checklist

  • Tests
  • Update API reference (docs/source/traits_api_reference)
  • Update User manual (docs/source/traits_user_manual)
  • Update type annotation hints in traits-stubs

@kitchoi kitchoi changed the title Implement parsing logic for creating an instance of Expression Implement parsing logic for converting text to Expression May 12, 2020
traits/observers/parsing.py Outdated Show resolved Hide resolved
@mdickinson mdickinson self-assigned this May 13, 2020
@mdickinson
Copy link
Member

Assigning to myself for review.

@mdickinson mdickinson self-requested a review May 13, 2020 12:32

e.g. ``trait("child").trait("age")`` matches ``child.age``
on an object, and is equivalent to
``trait("child").then(trait("age"))``
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we stick with just one way to do it here? I think trait("child").then(trait("age")) should be fine for now; we could add the trait("child").trait("age") support later if we think it's needed.

In some sense, this is a violation of the OCP: then is a general piece of machinery that lets us chain expressions, while if we follow this pattern we'll want an extra method for each new type of observer - we end up repeating ourselves, and each new observer type expects a new method in this class.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see, there isn't a repetition here, because of the _as_top_level magic that converts the method into a matching function. I'd recommend removing the magic and the method, and having a direct top-level function for creating the expression. That'll make for simpler, more direct code that's easier to maintain going forward.

Copy link
Contributor Author

@kitchoi kitchoi May 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very much a user interface (just not graphical or command-line-like), a bit like a main application. It seems laborious to do trait("child").then(trait("attr1")).then(trait("attr2")).then(trait("attr3")) when we could do trait("child").trait("attr").trait("attr2")

Likewise, this is more user-friendly:
trait("child").metadata("name")
than
trait("child").then(metadata("name"))
The latter also requires importing metadata in addition to trait.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@corranwebster Since you have reviewed the user experience before, thought I should check again if we are changing this. Is it okay if we ditch trait("name1").trait("name2").trait("name3") but require the user to go with trait("name1").then(trait("name2")).then(trait("name3")) instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This will apply to all the other features, e.g. metadata and filter_)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with this sort of change to the API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This means a lot of the tests I have written for future PRs will need to change. Sigh.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with keeping the .trait method, FWIW, just so long as we untangle so that it's clear that it's the .trait method that depends on the function, and not the other way around.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I will keep that then.

""" Handle trees when the notify is not immediately specified
as a suffix. The last notify flag will be used.

e.g. In "a.[b,c]:.d", the element "b" should receive a notify flag
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the :. here intentional, or a typo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a typo. Thanks!


import traits.observers.expression as _expr_module
from traits.observers._generated_parser import (
Lark_StandAlone as _Lark_StandAlone
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh my, that's horrible. Camel_CaseWithUnderscores? Not much we can do about that, I guess.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the name in the generated parser is out of our control. I suppose we could change the alias name...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious: why are we aliasing this in the first place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because this parsing is a public module, and I got overly excited about hiding variables from TAB auto-completion... except I failed to do this 100% because the standard library imports aren't given private names.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. We don't have any policy in place about doing this in general; I think that given that we rarely ever do a from ... import *, it's probably not needed.

_LARK_PARSER = _Lark_StandAlone()

#: Token annotation for a name (a trait name, or a metadata name, etc.)
_NAME_TOKEN = "NAME"
Copy link
Member

@mdickinson mdickinson May 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something we should be importing from somewhere? Where/how does Lark store the string constants it uses for token types?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "NAME" is data from the grammar, and so it is going to be somewhere in the generated code DATA or MEMO variable. However those two structures are not really for human consumption.

On the other hand, this _NAME_TOKEN is here only for sanity check. I can remove it too and just not check it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit odd if Lark exposes token.type as part of the public API but doesn't expose the constants for token.type to be compared to. How are consumers supposed to use token.type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not being clear...
I meant the "NAME" is defined by us:

NAME: /[a-zA-Z_]\w*/

If I change the grammar file such that "NAME: /[a-zA-Z_]\w*/" becomes "NAME2: /[a-zA-Z_]\w*/" (and everywhere else that uses it), then regenerate the standalone parser. This _NAME_TOKEN = "NAME" will need to change to _NAME_TOKEN = "NAME2".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is okay, I am going to remove this sanity check. It is on the border of being not really necessary given we already have tests for the parsed outcomes.

Copy link
Contributor Author

@kitchoi kitchoi May 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sanity check for "NAME" is not covered in tests. Likewise, this line is not covered either.

raise RuntimeError("Default notify flag unexpectedly changed.")

Copy link
Member

@mdickinson mdickinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM. I think this is baking in some of the Expression logic that we're going to want to change before the release, but that may be clearer when everything's together.

Please could we move the logic for the trait function into the function itself, and modify the trait method to use that function rather than the other way around? I think that'll let us get rid of the decorator magic, and leave us better prepared for the later changes.

@kitchoi
Copy link
Contributor Author

kitchoi commented May 14, 2020

Thank you for the review. I hope I have addressed all the comments.

@kitchoi kitchoi requested a review from mdickinson May 14, 2020 12:43
Copy link
Member

@mdickinson mdickinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kitchoi
Copy link
Contributor Author

kitchoi commented May 14, 2020

Thank you. Merging...

@kitchoi kitchoi merged commit fa56823 into master May 14, 2020
@kitchoi kitchoi deleted the 977-parsing branch May 14, 2020 14:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants