Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spread operator in record literals #11144

Merged
merged 20 commits into from
Nov 29, 2023
Merged

Conversation

ysthakur
Copy link
Member

@ysthakur ysthakur commented Nov 23, 2023

Goes towards implementing #10598, which asks for a spread operator in lists, in records, and when calling commands (continuation of #11006, which only implements it in lists)

Description

This PR is for adding a spread operator that can be used when building records. Additional functionality can be added later.

Changes:

  • Previously, the Expr::Record variant held (Expression, Expression) pairs. It now holds instances of an enum RecordItem (the name isn't amazing) that allows either a key-value mapping or a spread operator.
  • ... will be treated as the spread operator when it appears before $, {, or ( inside records (no whitespace allowed in between) (not implemented yet)
  • The error message for duplicate columns now includes the column name itself, because if two spread records are involved in such an error, you can't tell which field was duplicated from the spans alone

... will still be treated as a normal string outside records, and even in records, it is not treated as a spread operator when not followed immediately by a $, {, or (.

User-Facing Changes

Users will be able to use ... when building records.

> let rec = { x: 1, ...{ a: 2 } }
> $rec
╭───┬───╮
│ x │ 1 │
│ a │ 2 │
╰───┴───╯
> { foo: bar, ...$rec, baz: blah }
╭─────┬──────╮
│ foo │ bar  │
│ x   │ 1    │
│ a   │ 2    │
│ baz │ blah │
╰─────┴──────╯

If you want to update a field of a record, you'll have to use merge instead:

> { ...$rec, x: 5 }
Error: nu::shell::column_defined_twice

  × Record field or table column used twice: x
   ╭─[entry #2:1:1]
 1 │  { ...$rec, x: 5 }
   ·       ──┬─  ┬
   ·         │   ╰── field redefined here
   ·         ╰── field first defined here
   ╰────
> $rec | merge { x: 5 }
╭───┬───╮
│ x │ 5 │
│ a │ 2 │
╰───┴───╯

Tests + Formatting

After Submitting

@@ -30,7 +30,7 @@ pub enum Expr {
MatchBlock(Vec<(MatchPattern, Expression)>),
List(Vec<Expression>),
Table(Vec<Expression>, Vec<Vec<Expression>>),
Record(Vec<(Expression, Expression)>),
Record(Vec<RecordItem>),
Copy link
Member Author

@ysthakur ysthakur Nov 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It bugs me a little that there's an Expr::Spread variant, but it's not used for records, just lists. Would it be a good idea to make Expr::List also accept instances of some ListItem enum for consistency? It would mean refactoring a lot of code, though. Another possibility is keeping Vec<(Expression, Expression)> and representing spread expressions by pairing them with Nothing, but that's less safe.

@ysthakur ysthakur marked this pull request as ready for review November 28, 2023 00:10
Copy link
Member

@amtoine amtoine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is really cool, thanks for working on this new operator @ysthakur 🙏

i am convinced by the tests you wrote, i'll just wait for another folk to review the changes to the parser, i'm not very comfortable with that code 😢

@ysthakur
Copy link
Member Author

i'll just wait for another folk to review the changes to the parser, i'm not very comfortable with that code 😢

I'm a bit nervous about it (and other places) too, it's not the nicest. Another place that needs to be looked at more closely is the changes to flatten.rs. I don't really understand how that works, I just copied the existing code for flattening key-value pairs in records.

i am convinced by the tests you wrote

I'm not 😄. Like you mentioned, the changes to the parser are a bit complicated and it definitely needs to be tested more, but I can't think of more edge cases because I haven't really used records until now, so if you have any suggestions, that'd be very helpful.

@ysthakur
Copy link
Member Author

Currently, the ShellError::ColumnDefinedTwice error shows where the original and duplicate columns are but doesn't name them in the error. Since records can be spread now, we can't show the exact span of the original/duplicate field in the error message anymore, only the span of the expression that the field was defined in. Would it be okay if I added the name of the column to the error message?

@amtoine
Copy link
Member

amtoine commented Nov 28, 2023

  1. it's a new feature, if we don't catch all the rough edges, that's not a big deal i think 😌
  2. yeah, having more information when we can't have the span sounds a good idea 👍

@sholderbach sholderbach added the pr:breaking-change This PR implies a change affecting users and has to be noted in the release notes label Nov 29, 2023
Copy link
Member

@sholderbach sholderbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not an expert in the FlatShape world, but things look sound to me.

The parsing and handling of Record invariants looks good to me. Good call adding the name to the error.

For the release note there is an additional breaking change to mention around how record literals now disallow duplicate column names instead of just ignoring them. I think for literals this is a fine choice to make.

Thanks for the solid amount of tests in the happy path. We may overtime want to expand that with illegal syntax in the spread position or inside the record, to harden ther parser.

RecordItem::Spread(_, inner) => {
match eval_expression(engine_state, stack, inner)? {
Value::Record { val: inner_val, .. } => {
for (col_name, val) in inner_val {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There could be an argument to be had that one might want to have semantics where the spread overrides a local key or you could override a value from the spread by a following key.

But for now this is the clearly definable and easier to implement route even though it may reject some combinations of existing keys and incoming data at eval-time.

Comment on lines +1620 to +1622
} else if second_token.is_some_and(|c| {
c.len() > 3 && c.starts_with(b"...") && (c[3] == b'$' || c[3] == b'{' || c[3] == b'(')
}) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is certainly one of the scarier bits in the existing parse logic.

(Local soundness seems correct, as the bounds are checked)

Comment on lines +5206 to +5213
if curr_tok.starts_with(b"...")
&& curr_tok.len() > 3
&& (curr_tok[3] == b'$' || curr_tok[3] == b'{' || curr_tok[3] == b'(')
{
// Parse spread operator
let inner = parse_value(
working_set,
Span::new(curr_span.start + 3, curr_span.end),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part of the logic looks sane to me!

Comment on lines -294 to -296
// avoid duplicate cols.
let col_name = value_as_string(eval_constant(working_set, col)?, expr.span)?;
record.insert(col_name, eval_constant(working_set, val)?);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breaking change pin: This will turn duplication in const contexts into a hard error, which was not the case previously.

Copy link
Member

@amtoine amtoine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'll follow @sholderbach and land this PR, let's try it in the real world 😏

thanks for working on this again @ysthakur 🙏

@amtoine amtoine merged commit 0303d70 into nushell:main Nov 29, 2023
19 checks passed
@ysthakur ysthakur deleted the spread-record branch November 29, 2023 17:33
WindSoilder pushed a commit that referenced this pull request Dec 6, 2023
<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx

you can also mention related issues, PRs or discussions!
-->

# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.

Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->

It turns out that I left a bug in
[#11144](#11144), which
introduced a spread operator in record literals. When highlighting
subexpressions that are spread inside records, the spread operator and
the token before it are insert twice. Currently, when you type `{ ...()
}`, this is what you'll see:


![image](https://github.com/nushell/nushell/assets/45539777/9a76647a-6bbe-426e-95bc-50becf2fa537)

With the PR, the behavior is as expected:


![image](https://github.com/nushell/nushell/assets/45539777/36bdab23-3252-4500-8317-51278da0e869)

I'm still not sure how `FlatShape` works, I just copied the existing
logic for flattening key-value pairs in records, so it's possible
there's still issues, but I haven't found any yet (tried spreading
subexpressions, variables, and records).

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

Highlighting for subexpressions spread inside records should no longer
be screwed up.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

Is there any way to test flattening/syntax highlighting?

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
hardfau1t pushed a commit to hardfau1t/nushell that referenced this pull request Dec 14, 2023
Goes towards implementing nushell#10598, which asks for a spread operator in
lists, in records, and when calling commands (continuation of nushell#11006,
which only implements it in lists)

# Description
This PR is for adding a spread operator that can be used when building
records. Additional functionality can be added later.

Changes:

- Previously, the `Expr::Record` variant held `(Expression, Expression)`
pairs. It now holds instances of an enum `RecordItem` (the name isn't
amazing) that allows either a key-value mapping or a spread operator.
- `...` will be treated as the spread operator when it appears before
`$`, `{`, or `(` inside records (no whitespace allowed in between) (not
implemented yet)
- The error message for duplicate columns now includes the column name
itself, because if two spread records are involved in such an error, you
can't tell which field was duplicated from the spans alone

`...` will still be treated as a normal string outside records, and even
in records, it is not treated as a spread operator when not followed
immediately by a `$`, `{`, or `(`.

# User-Facing Changes
Users will be able to use `...` when building records.

```
> let rec = { x: 1, ...{ a: 2 } }
> $rec
╭───┬───╮
│ x │ 1 │
│ a │ 2 │
╰───┴───╯
> { foo: bar, ...$rec, baz: blah }
╭─────┬──────╮
│ foo │ bar  │
│ x   │ 1    │
│ a   │ 2    │
│ baz │ blah │
╰─────┴──────╯
```
If you want to update a field of a record, you'll have to use `merge`
instead:
```
> { ...$rec, x: 5 }
Error: nu::shell::column_defined_twice

  × Record field or table column used twice: x
   ╭─[entry nushell#2:1:1]
 1 │  { ...$rec, x: 5 }
   ·       ──┬─  ┬
   ·         │   ╰── field redefined here
   ·         ╰── field first defined here
   ╰────
> $rec | merge { x: 5 }
╭───┬───╮
│ x │ 5 │
│ a │ 2 │
╰───┴───╯
```

# Tests + Formatting

# After Submitting
hardfau1t pushed a commit to hardfau1t/nushell that referenced this pull request Dec 14, 2023
<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx

you can also mention related issues, PRs or discussions!
-->

# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.

Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->

It turns out that I left a bug in
[nushell#11144](nushell#11144), which
introduced a spread operator in record literals. When highlighting
subexpressions that are spread inside records, the spread operator and
the token before it are insert twice. Currently, when you type `{ ...()
}`, this is what you'll see:


![image](https://github.com/nushell/nushell/assets/45539777/9a76647a-6bbe-426e-95bc-50becf2fa537)

With the PR, the behavior is as expected:


![image](https://github.com/nushell/nushell/assets/45539777/36bdab23-3252-4500-8317-51278da0e869)

I'm still not sure how `FlatShape` works, I just copied the existing
logic for flattening key-value pairs in records, so it's possible
there's still issues, but I haven't found any yet (tried spreading
subexpressions, variables, and records).

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

Highlighting for subexpressions spread inside records should no longer
be screwed up.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

Is there any way to test flattening/syntax highlighting?

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
dmatos2012 pushed a commit to dmatos2012/nushell that referenced this pull request Feb 20, 2024
Goes towards implementing nushell#10598, which asks for a spread operator in
lists, in records, and when calling commands (continuation of nushell#11006,
which only implements it in lists)

# Description
This PR is for adding a spread operator that can be used when building
records. Additional functionality can be added later.

Changes:

- Previously, the `Expr::Record` variant held `(Expression, Expression)`
pairs. It now holds instances of an enum `RecordItem` (the name isn't
amazing) that allows either a key-value mapping or a spread operator.
- `...` will be treated as the spread operator when it appears before
`$`, `{`, or `(` inside records (no whitespace allowed in between) (not
implemented yet)
- The error message for duplicate columns now includes the column name
itself, because if two spread records are involved in such an error, you
can't tell which field was duplicated from the spans alone

`...` will still be treated as a normal string outside records, and even
in records, it is not treated as a spread operator when not followed
immediately by a `$`, `{`, or `(`.

# User-Facing Changes
Users will be able to use `...` when building records.

```
> let rec = { x: 1, ...{ a: 2 } }
> $rec
╭───┬───╮
│ x │ 1 │
│ a │ 2 │
╰───┴───╯
> { foo: bar, ...$rec, baz: blah }
╭─────┬──────╮
│ foo │ bar  │
│ x   │ 1    │
│ a   │ 2    │
│ baz │ blah │
╰─────┴──────╯
```
If you want to update a field of a record, you'll have to use `merge`
instead:
```
> { ...$rec, x: 5 }
Error: nu::shell::column_defined_twice

  × Record field or table column used twice: x
   ╭─[entry nushell#2:1:1]
 1 │  { ...$rec, x: 5 }
   ·       ──┬─  ┬
   ·         │   ╰── field redefined here
   ·         ╰── field first defined here
   ╰────
> $rec | merge { x: 5 }
╭───┬───╮
│ x │ 5 │
│ a │ 2 │
╰───┴───╯
```

# Tests + Formatting

# After Submitting
dmatos2012 pushed a commit to dmatos2012/nushell that referenced this pull request Feb 20, 2024
<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx

you can also mention related issues, PRs or discussions!
-->

# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.

Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->

It turns out that I left a bug in
[nushell#11144](nushell#11144), which
introduced a spread operator in record literals. When highlighting
subexpressions that are spread inside records, the spread operator and
the token before it are insert twice. Currently, when you type `{ ...()
}`, this is what you'll see:


![image](https://github.com/nushell/nushell/assets/45539777/9a76647a-6bbe-426e-95bc-50becf2fa537)

With the PR, the behavior is as expected:


![image](https://github.com/nushell/nushell/assets/45539777/36bdab23-3252-4500-8317-51278da0e869)

I'm still not sure how `FlatShape` works, I just copied the existing
logic for flattening key-value pairs in records, so it's possible
there's still issues, but I haven't found any yet (tried spreading
subexpressions, variables, and records).

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

Highlighting for subexpressions spread inside records should no longer
be screwed up.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

Is there any way to test flattening/syntax highlighting?

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr:breaking-change This PR implies a change affecting users and has to be noted in the release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants