Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libexpr: make "or"-as-variable less bogus #11121

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

rhendric
Copy link
Member

The previous place where OR_KW was inserted into the grammar to allow expressions like "map or [...]" led to a number of weird outcomes. By moving it to expr_simple, expressions using "or" as a variable are now parsed consistently with the rest of the language. Conflicts are prevented by telling Bison that OR_KW has higher precedence than '.'.

Fixes #11118.

If this is a ‘breaking change’, then I'm going to have to describe the existing behavior in the language reference. I'm not looking forward to that, because the existing behavior is bonkers.

Priorities and Process

Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

@rhendric rhendric requested a review from edolstra as a code owner July 16, 2024 21:18
@L-as
Copy link
Member

L-as commented Jul 16, 2024

Truly horrifying behavior, but it does classify as a breaking change I would say unfortunately.

The previous place where OR_KW was inserted into the grammar to allow
expressions like "map or [...]" led to a number of weird outcomes. By
moving it to expr_simple, expressions using "or" as a variable are now
parsed consistently with the rest of the language. Conflicts are
prevented by telling Bison that OR_KW has higher precedence than '.'.
@@ -263,6 +262,9 @@ expr_simple
else
$$ = new ExprVar(CUR_POS, state->symbols.create($1));
}
| /* Backwards compatibility: because Nixpkgs has a rarely used
Copy link
Member

@roberth roberth Jul 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| /* Backwards compatibility: because Nixpkgs has a rarely used
| /* Backwards compatibility: because Nixpkgs has a

Not for Nix to judge. (was also in the existing text)

@roberth
Copy link
Member

roberth commented Jul 16, 2024

Is it truly breaking though? Does it change a non-error behavior into a different behavior?

What would be a breaking change is to disallow or in bindings. That works, is used in lib.or and is possible to use even in let, although the corresponding reference is ugly: let or = 1; in ({ inherit or; }).or.
If I understand correctly, this PR lets us use normal reference syntax for bindings named or.

If we choose to do this, it does close some other doors.
For instance we could have made ({ a }: a or null) {} evaluate to null, or even let a = null; in a or 1 evaluate to 1.
However, I don't think these behaviors would be desirable. The first one degrades our scope rules, which are already complicated enough due to with. The latter doesn't combine well with ExprSelect, so I can think we can rule that out too. Specifically, a.b.c or d.e or x already has a meaning, ExprSelect(a, ["b", "c"], ExprSelect(d, ["e"], ExprVar "x")), but we'd have to explicitly disambiguate that it's not ExprSelect(a, ["b", "c"], ExprOr(ExprSelect(d, ["e"]), ExprVar "x")). Worse, users would have to understand that distinction.

So I think the only thing we're looking for is an example of a true breaking change. I can't think of one right now, but I think we could analyze the grammar to make sure?

@rhendric
Copy link
Member Author

Does it change a non-error behavior into a different behavior?

I am honor-bound to report that yes, it does:

builtins.length (let or = 1; in [ (x: x) or ])

evaluates to 1 currently and 2 with this patch.

I think it's extraordinarily unlikely that this will cause anyone any problems, but if one is inclined to hew absolutely to the letter of the law with no exceptions, this is a breaking language change.

@roberth
Copy link
Member

roberth commented Jul 16, 2024

evaluates to 1

That is horrible and was surely never intended, but yes, we should be very careful about this kind of thing.
I'd like to deprecate buggy syntax constructs like this or [ 0x10 ], so that they can eventually be replaced in a number of years, or as a preparation for language versioning.
Do we have more examples?

@roberth
Copy link
Member

roberth commented Jul 16, 2024

Could we distinguish the broken cases from the sensible ones? let or = 1; in map or l is clear and unambiguous. Maybe that could be implemented by changing the grammar for list items to behave like before?

@roberth roberth added language The Nix expression language; parser, interpreter, primops, evaluation, etc breaking Changes we can't make without breaking old expressions, changing hashes, etc labels Jul 16, 2024
@roberth
Copy link
Member

roberth commented Jul 16, 2024

I suppose another workaround is to allow specifically (or) as a syntax for ExprVar("or"). Those four characters seems to be consistently a syntax error, at least in lists. I guess we have to be careful about inherit (or) a1; though.

@rhendric
Copy link
Member Author

rhendric commented Jul 17, 2024

Do we have more examples?

Yeah, this is 4 currently and an error with the patch:

let or = 1; in (x: x * 2) (x: x + 1) or

Here's a really cursed one: 2 currently and an error with the patch:

let or = 1; in { a = 2; }.a or (x: x) or

Could we distinguish the broken cases from the sensible ones? let or = 1; in map or l is clear and unambiguous. Maybe that could be implemented by changing the grammar for list items to behave like before?

As the above shows, it's not just list elements that are affected. But if you're asking whether we could detect uses of or that are parsed differently in this patch and present a warning or something... I think so, but I'd have to work it out.

I suppose another workaround is to allow specifically (or) as a syntax for ExprVar("or"). Those four characters seems to be consistently a syntax error, at least in lists. I guess we have to be careful about inherit (or) a1; though.

This patch does enable that, but I'm not seeing how that alone (without any other grammar changes) would fix the current issues with bad parses. It would just enable users to use or in more places, which I'm actually slightly against; I'm just trying to support the map or l syntax we've committed to without the disturbing side effects. If you wanted the existing bad parses to be syntax errors instead, I could probably rig that—it'd still be a breaking change, though, as all of these examples show.

Copy link
Member

@inclyc inclyc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really a bug? For a long time parser developers of nix language were considering this is a feature of Nix language and maybe we shouldn't fix that feature. (Both nixd and nil can correctly deal with that strange precedence).

In my humble opinion, if this should be "fixed" maybe just mark OR_KW a keyword. And just don't allow keyword used as variable names, that's straightforward way of many other languages. For backward compatibility, we could introduce a new Xp options that disallow keywords like or used as variable names, and waiting for migration. (We have already done that for "url-literals").

That weird or production mainly used for backward compatibility. Then why should we introduce a new "breaking" change to break that compatibility?

@rhendric
Copy link
Member Author

Behavior that is both confusing and unintentional is not a feature.

If we didn't have to support any expressions that use or as a variable, I'd agree with making it a pure keyword. But assuming that we do, it's better to do so in a not-confusing way than in a confusing way.

This is a ‘breaking’ change only for cases that are exceedingly unlikely to arise in code that wasn't specifically written to be a Nix brainteaser or parser test case. I agree that caution is merited even in such cases, but not so much caution that we're forever trapped with the mistakes of ten years ago. We should be able to fix things that are both confusing and unintentional; it reflects poorly on the project if we can't.

@inclyc
Copy link
Member

inclyc commented Jul 26, 2024

Behavior that is both confusing and unintentional is not a feature.

The behavior is intentional I suppose, it even have some comments in parser.y, that's fairly rare.

This is a ‘breaking’ change only for cases that are exceedingly unlikely to arise in code that wasn't specifically written to be a Nix brainteaser or parser test case.

IIRC, OR_KW used for that production mainly used for supporting or used as a variable as a part of map or. I just grepped a whole nixpkgs and there is NO such case in fact, so kindly I suggest "marking it as a pure keyword" solution.

@roberth
Copy link
Member

roberth commented Jul 26, 2024

Personally, I'd be ok doing this with a long deprecation cycle where we warn about the upcoming change for a couple of years.

bug?

not a feature.

I think the term is "legacy", and that is - in part - a good thing; almost as good as removing it responsibly.

Very responsibly, because Nix promises to be reproducible, and that's an empty claim if it's hard to keep up with its changes, and especially changes in the language, which - in principle - we should have none of.

@rhendric
Copy link
Member Author

Personally, I'd be ok doing this with a long deprecation cycle where we warn about the upcoming change for a couple of years.

Good enough for me!

Which ‘this’, though—making or a less-bogus contextual keyword as in this PR, or making or a pure keyword? I'm fine with either outcome, if we're going through a deprecation cycle.

@roberth
Copy link
Member

roberth commented Jul 26, 2024

making or a less-bogus contextual keyword as in this PR

This involves preserving the current behavior, and warning about the cases where the current behavior doesn't match the behavior we want.
Probably much easier said than done.

making or a pure keyword?

This seems easier to implement, but also less valuable and potentially more disruptive.

@tomberek
Copy link
Contributor

I'd say that if we can't find a Nixpkgs where this change matters for drvPaths, we fix this as a bug, rather than considering a breaking language change.

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-08-28-nix-team-meeting-minutes-173/51302/1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Changes we can't make without breaking old expressions, changing hashes, etc language The Nix expression language; parser, interpreter, primops, evaluation, etc
Projects
Status: ⚖ To discuss
Development

Successfully merging this pull request may close these issues.

Incorrect parsing of or as a variable in expressions
6 participants