Reduce dependencies size #5

Closed
4 tasks done
korgan00 opened this issue May 19, 2023 · 5 comments
Labels
🙅 no/wontfix This is not (enough of) an issue for this project 👎 phase/no Post cannot or will not be acted on

Comments

@korgan00

Initial checklist

Problem

I am working on a project that uses unified, remark, and rehype. While trying to improve deploy times, I realized that the installed size of this dependency is huge because of parse5.
Here is part of the installation sizes:
Screenshot 2023-05-18 at 11 06 56 AM

Solution

Don't use parse5; try to replace it with another solution or use the unified system to parse.

Alternatives

A custom parser.

@github-actions github-actions bot added 👋 phase/new Post is being triaged automatically 🤞 phase/open Post is being triaged manually and removed 👋 phase/new Post is being triaged automatically labels May 19, 2023
@wooorm
Member

wooorm commented May 19, 2023

(this thread is a continuation from syntax-tree/hast-util-to-html#38 (comment))

Hi again. That’s impossible. We need to parse HTML. So we need an HTML parser.

Please ask a question. Don’t ask for a solution. This solution doesn’t work. See https://github.com/remarkjs/.github/blob/main/support.md. Spend time on your question. Explain what tools you have. Why do you have remark and parse5? Explain explain explain.

@korgan00
Author

I don't have a question. I opened a feature request: in this case, to change the way it is parsed.

I don't have parse5 on purpose. I have parse5 because of this dependency.

Since this is a unified plugin/extension (which is usually used to parse and convert), I don't expect to have another parser from outside the unified stack as a sub-dependency.

I think it would be nice to have a custom parser inside the unified stack instead of using external parsers, and it would probably perform better.

@wooorm
Member

wooorm commented May 19, 2023

It takes a full-time job, I think at least 1 year but perhaps longer, to create a new WHATWG-compliant parser. Feel free to pay me, say, $120k or so and I’ll make it; it’ll be fun to learn. But I doubt it will be much better than parse5. I think I could make it a bit smaller, sure, but that’s a high price.

Besides being created, a new parser would also have to be maintained. Most of the work in unified goes toward maintaining parsers. I already maintain 700+ packages; I don’t think I can take on another parser that requires so much maintenance time.

> I have parse5 because of this dependency.

Why? Why do you have hast-util-from-html as a dependency? Why do you have rehype-parse as a dependency if your input is markdown? That’s what I’m asking. You probably don’t.

@ChristianMurphy
Member

ChristianMurphy commented May 19, 2023

Hey @korgan00!
Thanks for reaching out.

Parsers generally face an iron triangle of constraints: speed, correctness, and size.
Parse5 optimizes for correctness, speed, and size in that order.
Different parsers make different trade-offs; you are welcome to bridge another parser to generate HAST.

It sounds like you are looking for an out-of-the-box solution that requires less to download.
Consider leveraging https://github.com/syntax-tree/hast-util-from-html-isomorphic, which uses the built-in browser APIs to parse when possible (these are actually even larger than parse5, but they are baked into browsers and some server runtimes, so they don't need to be downloaded).

If you specifically need parsing in an environment where there is no built-in parser, then you have some options:

  1. consider improving parse5, through contributing code or financial support, to further improve its speed and bundle size
  2. consider bridging another existing HTML parser to generate HAST
  3. consider sponsoring the creation of a new HTML parser. Be aware, though, that this is a large effort, and the time, finances, and resources needed to push it forward would be significant.
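
To give a rough idea of what option 2 (bridging another parser) could involve, here is a minimal sketch. It assumes a hypothetical parser that returns nodes shaped like `ForeignNode` below; that shape, and the names `toHast` and `toHastNode`, are made up for illustration and do not belong to any real library. The hast types are a small local subset declared inline so the sketch has no dependencies. A real bridge would also need to map attribute names to hast property names (for example `class` to `className`) and handle comments and doctypes, which this sketch skips.

```typescript
// Small local subset of the hast node shapes (declared here so nothing is imported).
type HastText = { type: 'text'; value: string };
type HastElement = {
  type: 'element';
  tagName: string;
  properties: Record<string, string>;
  children: HastNode[];
};
type HastNode = HastText | HastElement;
type HastRoot = { type: 'root'; children: HastNode[] };

// ASSUMPTION: hypothetical output shape of "some other HTML parser".
type ForeignNode =
  | { kind: 'text'; data: string }
  | {
      kind: 'tag';
      name: string;
      attribs: Record<string, string>;
      children: ForeignNode[];
    };

// Convert one foreign node (and its descendants) to a hast node.
function toHastNode(node: ForeignNode): HastNode {
  if (node.kind === 'text') {
    return { type: 'text', value: node.data };
  }

  return {
    type: 'element',
    // Normalize tag names to lowercase.
    tagName: node.name.toLowerCase(),
    // Simplification: attributes are copied as-is, without the
    // attribute-to-property mapping a real bridge would perform.
    properties: { ...node.attribs },
    children: node.children.map(toHastNode),
  };
}

// Wrap a list of top-level foreign nodes in a hast root.
function toHast(nodes: ForeignNode[]): HastRoot {
  return { type: 'root', children: nodes.map(toHastNode) };
}
```

The resulting tree can then be handed to the rest of the pipeline like any other hast tree. Whether this ends up smaller or faster than parse5 depends entirely on the parser being bridged and the trade-offs it makes.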

TL;DR
The issue trackers in unified and syntax-tree are for actionable features and bugs.
"change the way it is parsed" is not useful or actionable.
As such this ticket will be closed.

That said, I hear and appreciate that you have constraints around code size.
We have a discussion forum https://github.com/orgs/syntax-tree/discussions/new/choose
Folks are happy to talk through issues and help out.

Focus the discussion on your actual project constraints.
"I need a smaller bundle" is not a constraint.
A constraint could look something like:

  • I need to run the parser on an embedded device with less than 1 GB of RAM
  • I have users on 3G networks and need the page to be responsive within 10 seconds
  • etc

Share the actual constraint that is driving this request.

The support guide https://github.com/syntax-tree/.github/blob/main/support.md offers further suggestions on how to frame the question so others can help you.

@ChristianMurphy ChristianMurphy closed this as not planned May 19, 2023
@ChristianMurphy ChristianMurphy added the 🙅 no/wontfix This is not (enough of) an issue for this project label May 19, 2023
@github-actions github-actions bot added 👎 phase/no Post cannot or will not be acted on and removed 🤞 phase/open Post is being triaged manually labels May 19, 2023