Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ast)!: generate ast_builder.rs. #3890

Merged
merged 1 commit into from
Jul 9, 2024

Conversation

rzvxa
Copy link
Collaborator

@rzvxa rzvxa commented Jun 24, 2024

Every structure has 2 builder methods:

  1. xxx e.g. block_statement
    #[inline]
    pub fn block_statement(self, span: Span, body: Vec<'a, Statement<'a>>) -> BlockStatement<'a> {
        BlockStatement { span, body, scope_id: Default::default() }
    }
  1. alloc_xxx e.g. alloc_block_statement
    #[inline]
    pub fn alloc_block_statement(
        self,
        span: Span,
        body: Vec<'a, Statement<'a>>,
    ) -> Box<'a, BlockStatement<'a>> {
        self.block_statement(span, body).into_in(self.allocator)
    }

We generate 3 types of methods for enums:

  1. yyy_xxx e.g. statement_block
    #[inline]
    pub fn statement_block(self, span: Span, body: Vec<'a, Statement<'a>>) -> Statement<'a> {
        Statement::BlockStatement(self.alloc(self.block_statement(span, body)))
    }
  1. yyy_from_xxx e.g. statement_from_block
    #[inline]
    pub fn statement_from_block<T>(self, inner: T) -> Statement<'a>
    where
        T: IntoIn<'a, Box<'a, BlockStatement<'a>>>,
    {
        Statement::BlockStatement(inner.into_in(self.allocator))
    }
  1. yyy_xxx where xxx is inherited e.g. statement_declaration
    #[inline]
    pub fn statement_declaration(self, inner: Declaration<'a>) -> Statement<'a> {
        Statement::from(inner)
    }

Generic parameters:

We no longer accept Box<'a, ADT>, Atom or &'a str, Instead we use IntoIn<'a, Box<'a, ADT>>, IntoIn<'a, Atom<'a>> and IntoIn<'a, &'a str> respectively.
It allows us to rewrite things like this:

let ident = IdentifierReference::new(SPAN, Atom::from("require"));
let number_literal_expr = self.ast.expression_numeric_literal(
    right_expr.span(),
    num,
    raw,
    self.ast.new_str(num.to_string().as_str()),
    NumberBase::Decimal,
);

As this:

let ident = IdentifierReference::new(SPAN, "require");
let number_literal_expr = self.ast.expression_numeric_literal(
    right_expr.span(),
    num,
    raw,
    num.to_string(),
    NumberBase::Decimal,
);

Copy link

graphite-app bot commented Jun 24, 2024

Your org has enabled the Graphite merge queue for merging into main

Add the label “merge” to the PR and Graphite will automatically add it to the merge queue when it’s ready to merge. Or use the label “hotfix” to add to the merge queue as a hot fix.

You must have a Graphite account and log in to Graphite in order to use the merge queue. Sign up using this link.

@overlookmotel
Copy link
Collaborator

overlookmotel commented Jun 24, 2024

This one is trickier, because all kinds of weird methods have been added to AST builder over time, and I don't think there's much consistency in:

  • Which methods are implemented and which are not.
  • What params methods take - some receive T, others Box<T>.
  • What methods return - T, Box<T>, or Expression / Statement.
  • Naming of methods.

I think we might need to standardize the methods in the existing manually-written impl first, before moving over to a codegen-ed version.

There's also a danger of generating lots more code than we have at present, and hurting compile times, if we try to make the codegen-ed version cover all possible cases. If the amount of code generated is excessive, maybe we need attrs on the AST types which indicate which builders to generate?

#[visited_node]
#[builder(Expression)]
struct BooleanLiteral {
    #[cfg_attr(feature = "serialize", serde(flatten))]
    pub span: Span,
    pub value: bool,
}

This would generate:

impl<'a> AstBuilder<'a> {
    // Always generated by default
    fn boolean_literal(self, span: Span, value: bool) -> BooleanLiteral {
        BooleanLiteral { span, value }
    }

    // Generated because of `#[builder(Expression)]` attr
    fn boolean_literal_expression(self, span: Span, value: bool) -> Expression<'a> {
        Expression::BooleanLiteral(self.alloc(BooleanLiteral { span, value }))
    }
}

Other methods that we could have generated (but don't want to):

impl<'a> AstBuilder<'a> {
    fn boolean_literal_array_expression_element(self, span: Span, value: bool) -> ArrayExpressionElement<'a> {
        ArrayExpressionElement::BooleanLiteral(self.alloc(BooleanLiteral { span, value }))
    }

    fn boolean_literal_property_key(self, span: Span, value: bool) -> PropertyKey<'a> {
        PropertyKey::BooleanLiteral(self.alloc(BooleanLiteral { span, value }))
    }

    fn boolean_literal_argument(self, span: Span, value: bool) -> Argument<'a> {
        Argument::BooleanLiteral(self.alloc(BooleanLiteral { span, value }))
    }
}

@overlookmotel
Copy link
Collaborator

overlookmotel commented Jun 25, 2024

Probably what I suggested above is too complicated. Maybe better to tackle this incrementally?

e.g. Start by replacing all the methods for Expressions and Statements as 1 PR. Then continue whittling away the easy/standard ones (e.g. Declarations) in further PRs, until most of manual impls file is cleared out, and then it's easier to see what oddities remain.

For all Expressions, we could generate:

impl BooleanLiteral {
    #[inline]
    fn new(span: Span, value: bool) -> BooleanLiteral {
        BooleanLiteral { span, value }
    }

    #[inline]
    fn new_expression<'a>(span: Span, value: bool, allocator: &'a Allocator) -> Expression<'a> {
        Expression::BooleanLiteral(
            Box::new_in(BooleanLiteral::new(span, value), allocator)
        )
    }
}

impl<'a> AstBuilder<'a> {
    #[inline]
    fn boolean_literal(self, span: Span, value: bool) -> Expression<'a> {
        BooleanLiteral::new_expression(span, value, self.allocator)
    }
}

If consumer wants an Argument::BooleanLiteral, they can convert themselves with Argument::from(ast.boolean_literal(span, value)).

As I see it, the main value of AstBuilder is to hold the allocator and so provide the nicer API self.ast.boolean_literal(span, value) rather than BooleanLiteral::new_expression(span, value, self.allocator). So any function which doesn't require the allocator doesn't need to be in AstBuilder.

@rzvxa rzvxa force-pushed the 06-24-feat_ast_codegen_generate_ast_kind.rs_ branch from d82dab5 to 0f6e917 Compare June 25, 2024 08:07
@rzvxa rzvxa force-pushed the 06-25-feat_ast_codegen_generate_ast_builder.rs_ branch from 6fc357d to e5b612c Compare June 25, 2024 08:07
@rzvxa rzvxa force-pushed the 06-25-feat_ast_codegen_generate_ast_builder.rs_ branch from e5b612c to 450ccb9 Compare June 25, 2024 08:13
@rzvxa
Copy link
Collaborator Author

rzvxa commented Jun 25, 2024

@overlookmotel let me mull this one for a little while. I'll come back when I have figured out a working plan to discuss.

@rzvxa rzvxa force-pushed the 06-24-feat_ast_codegen_generate_ast_kind.rs_ branch from 3d6ab82 to abd819e Compare June 25, 2024 09:02
@rzvxa rzvxa force-pushed the 06-25-feat_ast_codegen_generate_ast_builder.rs_ branch from 450ccb9 to 9ceae0a Compare June 25, 2024 09:02
@rzvxa rzvxa force-pushed the 06-24-feat_ast_codegen_generate_ast_kind.rs_ branch from abd819e to 0755bdf Compare June 25, 2024 09:31
@rzvxa rzvxa force-pushed the 06-25-feat_ast_codegen_generate_ast_builder.rs_ branch from 9ceae0a to 707fd3d Compare June 25, 2024 09:31
@rzvxa rzvxa marked this pull request as ready for review June 25, 2024 09:36
@rzvxa rzvxa marked this pull request as draft June 25, 2024 09:37
@Boshen Boshen force-pushed the 06-24-feat_ast_codegen_generate_ast_kind.rs_ branch from 0755bdf to 86ae683 Compare June 25, 2024 09:49
@Boshen Boshen force-pushed the 06-25-feat_ast_codegen_generate_ast_builder.rs_ branch from 707fd3d to 4aeb8fa Compare June 25, 2024 09:50
@rzvxa rzvxa force-pushed the 06-24-feat_ast_codegen_generate_ast_kind.rs_ branch from 86ae683 to 8f78662 Compare June 25, 2024 10:36
@rzvxa rzvxa force-pushed the 06-25-feat_ast_codegen_generate_ast_builder.rs_ branch from 4aeb8fa to b5cce9d Compare June 25, 2024 10:37
Copy link

codspeed-hq bot commented Jun 25, 2024

CodSpeed Performance Report

Merging #3890 will not alter performance

Comparing 06-25-feat_ast_codegen_generate_ast_builder.rs_ (d347aed) with main (91c792a)

Summary

✅ 29 untouched benchmarks

@rzvxa
Copy link
Collaborator Author

rzvxa commented Jun 25, 2024

Related to backlog#29

Right now I can think of 3 options to unify all of our builder calls(both in terms of name and signature),

1. Let the user do some of the work

Have a bunch of enum_variant methods like this:

fn statement_block_statement(self, it: BlockStatement<'a>) -> Statement<'a> {
    Statement::BlockStatement(self.alloc(it))
}

fn block_statement(
    self,
    span: Span,
    body: Vec<'a, Statement<'a>>,
) -> BlockStatement<'a> {
    BlockStatement {
        span,
        body,
        scope_id: Default::default(),
    }
}

then we use it like this:

let block = self.ast.block_statement(span, body);
let stmt = self.ast.statement_block_statement(block);

2. More methods

We have methods like these:

fn statement_block_statement(
    self,
    span: Span,
    body: Vec<'a, Statement<'a>>,
) -> Statement<'a> {
    Statement::BlockStatement(self.alloc(self.block_statement(span, body)))
}

fn block_statement(
    self,
    span: Span,
    body: Vec<'a, Statement<'a>>,
) -> BlockStatement<'a> {
    BlockStatement {
        span,
        body,
        scope_id: Default::default(),
    }
}

and use like this:

let stmt = self.ast.statement_block_statement(span, body);

This is nice but it would mean that if you want to wrap an already existing block in a Statement enum you have to do something like this:

let stmt = Statement::BlockStatement(self.alloc(block));

Or rename the method we previously called statement_block_statement to new_statement_block_statement; and generate more functions to use like this(similar to 1):

fn statement_block_statement(self, it: BlockExpression<'a>) -> Expression<'a> {
    Statement::BlockStatement(self.alloc(it))
}

3. Using From trait

We generate these methods:

fn block_statement(
    self,
    span: Span,
    body: Vec<'a, Statement<'a>>,
) -> Box<'a, BlockStatement<'a>> {
    self.alloc(BlockStatement {
        span,
        body,
        scope_id: Default::default(),
    })
}

we also generate From implementations for variant initialization.

If we want to initialize a try we write something like this:

let try = self.ast.try_statement(
    span,
    self.ast.block_statement(block_span, block_body)
);

If we want a Statement from our block we just do this:

let stmt = self.ast.block_statement(span, body).into();

TL;DR

So to sum up here all 3 approaches that I can think of together:

// 1
let block = self.ast.block_statement(span, body);
let block_stmt = self.ast.statement_block_statement(block);
// 2
let block_stmt = self.ast.statement_block_statement(span, body);
// 3
let block_stmt = self.ast.block_statement(span, body).into();


// 1
let try_stmt = self.ast.try_statement(
    span,
    self.ast.alloc(self.ast.block_statement(block_span, block_body))
    handler,
    finalizer,
)
// 2
let try_stmt = self.ast.try_statement(
    span,
    self.ast.alloc(self.ast.block_statement(block_span, block_body))
    handler,
    finalizer,
)
// 3
let try_stmt = self.ast.try_statement(
    span,
    self.ast.block_statement(block_span, block_body).into()
    handler,
    finalizer,
)

I personally like option 3 more than the other 2 but I'm open to any objections.

@Boshen @overlookmotel I would like to know your thoughts on this, And what are your suggestions.

@rzvxa
Copy link
Collaborator Author

rzvxa commented Jun 25, 2024

There is actually a fourth and fifth option which @overlookmotel mentioned these are the 3 that I can think of. So don't forget to read the previous comments.

@rzvxa rzvxa force-pushed the 06-24-feat_ast_codegen_generate_ast_kind.rs_ branch from 8f78662 to 1e31594 Compare June 25, 2024 14:33
@rzvxa rzvxa force-pushed the 06-25-feat_ast_codegen_generate_ast_builder.rs_ branch from 921bb97 to 36a5ef7 Compare June 25, 2024 14:33
@rzvxa rzvxa force-pushed the 07-08-feat_ast_codegen_add_ast_builder_generator branch from d68c7ab to 0e33504 Compare July 8, 2024 15:20
@rzvxa rzvxa force-pushed the 06-25-feat_ast_codegen_generate_ast_builder.rs_ branch from b98e220 to 62bae25 Compare July 8, 2024 15:21
@Boshen Boshen force-pushed the 07-08-feat_ast_codegen_add_ast_builder_generator branch from 0e33504 to d780b85 Compare July 8, 2024 15:37
@Boshen Boshen force-pushed the 06-25-feat_ast_codegen_generate_ast_builder.rs_ branch from 62bae25 to 266b292 Compare July 8, 2024 15:38
@Boshen
Copy link
Member

Boshen commented Jul 8, 2024

We can merge once @overlookmotel finishes the review, along with a branch in a rolldown fork with oxc pointing to this branch.

@rzvxa
Copy link
Collaborator Author

rzvxa commented Jul 8, 2024

We can merge once @overlookmotel finishes the review, along with a branch in a rolldown fork with oxc pointing to this branch.

It is based on #4124, Give it a look and let me know if there is anything I should consider changing. I'm mainly not sure about the snapshot changes, I know the pure comments are a valid change(#4119) but for some reason, the snapshot path differs on my system.
https://github.com/rzvxa/rolldown/tree/migrate-to-new-builder

@rzvxa rzvxa force-pushed the 07-08-feat_ast_codegen_add_ast_builder_generator branch from 77a3542 to 30f1b17 Compare July 8, 2024 22:28
@rzvxa rzvxa force-pushed the 06-25-feat_ast_codegen_generate_ast_builder.rs_ branch from 97f265b to 8d65e15 Compare July 8, 2024 22:28
Copy link
Collaborator

@overlookmotel overlookmotel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do have some thoughts/potential changes to propose, but given the size of the PR, let's get it merged ASAP, and then iterate.

@overlookmotel overlookmotel force-pushed the 07-08-feat_ast_codegen_add_ast_builder_generator branch from 30f1b17 to 91c792a Compare July 9, 2024 00:47
Copy link

graphite-app bot commented Jul 9, 2024

Merge activity

  • Jul 8, 8:47 PM EDT: The merge label 'merge' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
  • Jul 8, 8:57 PM EDT: overlookmotel added this pull request to the Graphite merge queue.
  • Jul 8, 9:08 PM EDT: overlookmotel merged this pull request with the Graphite merge queue.

@overlookmotel overlookmotel changed the base branch from 07-08-feat_ast_codegen_add_ast_builder_generator to main July 9, 2024 00:51
@overlookmotel overlookmotel force-pushed the 06-25-feat_ast_codegen_generate_ast_builder.rs_ branch from 8d65e15 to b1615c8 Compare July 9, 2024 00:52
### Every structure has 2 builder methods:

1. `xxx` e.g. `block_statement`
```rust
    #[inline]
    pub fn block_statement(self, span: Span, body: Vec<'a, Statement<'a>>) -> BlockStatement<'a> {
        BlockStatement { span, body, scope_id: Default::default() }
    }
```
2. `alloc_xxx` e.g. `alloc_block_statement`
```rust
    #[inline]
    pub fn alloc_block_statement(
        self,
        span: Span,
        body: Vec<'a, Statement<'a>>,
    ) -> Box<'a, BlockStatement<'a>> {
        self.block_statement(span, body).into_in(self.allocator)
    }
```

### We generate 3 types of methods for enums:

1. `yyy_xxx` e.g. `statement_block`
```rust
    #[inline]
    pub fn statement_block(self, span: Span, body: Vec<'a, Statement<'a>>) -> Statement<'a> {
        Statement::BlockStatement(self.alloc(self.block_statement(span, body)))
    }
```
2. `yyy_from_xxx` e.g. `statement_from_block`
```rust
    #[inline]
    pub fn statement_from_block<T>(self, inner: T) -> Statement<'a>
    where
        T: IntoIn<'a, Box<'a, BlockStatement<'a>>>,
    {
        Statement::BlockStatement(inner.into_in(self.allocator))
    }
```
3. `yyy_xxx` where `xxx` is inherited e.g. `statement_declaration`
```rust
    #[inline]
    pub fn statement_declaration(self, inner: Declaration<'a>) -> Statement<'a> {
        Statement::from(inner)
    }
```

------------

### Generic parameters:

We no longer accept `Box<'a, ADT>`, `Atom` or `&'a str`, Instead we use `IntoIn<'a, Box<'a, ADT>>`, `IntoIn<'a, Atom<'a>>` and `IntoIn<'a, &'a str>` respectively.
It allows us to rewrite things like this:
```rust
let ident = IdentifierReference::new(SPAN, Atom::from("require"));
let number_literal_expr = self.ast.expression_numeric_literal(
    right_expr.span(),
    num,
    raw,
    self.ast.new_str(num.to_string().as_str()),
    NumberBase::Decimal,
);
```
As this:
```rust
let ident = IdentifierReference::new(SPAN, "require");
let number_literal_expr = self.ast.expression_numeric_literal(
    right_expr.span(),
    num,
    raw,
    num.to_string(),
    NumberBase::Decimal,
);
```
@overlookmotel overlookmotel force-pushed the 06-25-feat_ast_codegen_generate_ast_builder.rs_ branch from b1615c8 to d347aed Compare July 9, 2024 00:57
@graphite-app graphite-app bot merged commit d347aed into main Jul 9, 2024
26 checks passed
@graphite-app graphite-app bot deleted the 06-25-feat_ast_codegen_generate_ast_builder.rs_ branch July 9, 2024 01:08
@github-actions github-actions bot mentioned this pull request Jul 9, 2024
Boshen added a commit that referenced this pull request Jul 9, 2024
## [0.18.0] - 2024-07-09

- d347aed ast: [**BREAKING**] Generate `ast_builder.rs`. (#3890) (rzvxa)

### Features

- c6c16a5 minifier: Dce all conditional expressions (#4135) (Boshen)
- 365d9ba oxc_codegen: Generate annotation comments before
`CallExpression` and `NewExpression` (#4119) (IWANABETHATGUY)
- 3a0f2aa parser: Check for illegal modifiers in modules and namespaces
(#4126) (DonIsaac)
- 2f53bdf semantic: Check for abstract ClassElements in non-abstract
classes (#4127) (DonIsaac)
- c4ee9f8 semantic: Check for abstract initializations and
implementations (#4125) (Don Isaac)
- 44c7fe3 span: Add various implementations of `FromIn` for `Atom`.
(#4090) (rzvxa)

### Bug Fixes

- cb1af04 isolated-declarations: Remove the `async` and `generator`
keywords from `MethodDefinition` (#4130) (Dunqing)

Co-authored-by: Boshen <[email protected]>
github-merge-queue bot pushed a commit to rolldown/rolldown that referenced this pull request Jul 9, 2024
You can read about the new builder methods here:
oxc-project/oxc#3890

We also provide the `IntoIn` and `FromIn` in the `oxc_allocator` crate
so I've removed the rolldown helper's version.

I didn't touch the snippets but now some of them can painlessly be
replaced with builder methods since we accept generic arguments for
things that can be allocated on the arena. So now you can pass `&str`
where `Atom` or `&'ast str` is expected. Or pass in a stack ast type and
have it boxed in the builder using `IntoIn` bounds.
github-merge-queue bot pushed a commit to rolldown/rolldown that referenced this pull request Jul 9, 2024
You can read about the new builder methods here:
oxc-project/oxc#3890

We also provide the `IntoIn` and `FromIn` in the `oxc_allocator` crate
so I've removed the rolldown helper's version.

I didn't touch the snippets but now some of them can painlessly be
replaced with builder methods since we accept generic arguments for
things that can be allocated on the arena. So now you can pass `&str`
where `Atom` or `&'ast str` is expected. Or pass in a stack ast type and
have it boxed in the builder using `IntoIn` bounds.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ast Area - AST A-isolated-declarations Isolated Declarations A-minifier Area - Minifier A-parser Area - Parser A-transformer Area - Transformer / Transpiler merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants