-
Notifications
You must be signed in to change notification settings - Fork 29.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support .noderc or similar file-based initialization configurations? #53787
Comments
I’ve proposed this in the past, such as in #49148 (comment) and https://github.com/orgs/nodejs/discussions/44975#discussioncomment-3868855. The short version is that I think we need a config file, such as The recent So yes, I do think we need a config file, and it should support all of Node’s configuration (or at least as much as is in {
"options": {
// Keys can be any flag available to NODE_OPTIONS, camelcased
"import": "tsx",
"experimentalRequireModule": true
}
} Or the same within a Another consideration is conditions; should a single file support multiple values based on condition, so you could do something like |
+1, but I am very much against JSON for the format, mainly because it doesn't support comments. |
+1 here as well. And I agree with @targos. |
It's important that the format be easily editable by other tools, such as those registering hooks. Anything other than comment-less JSON means that those tools need a dependency, unless we create a new builtin for the format. We've had |
That was my idea. Have a new builtin format. |
If we want to be aligned with npmrc, https://www.npmjs.com/package/ini can be an option too (there are other native INI parsers too, like https://github.com/benhoyt/inih)
I think we should consider supporting a subset and evaluate them on a case by case basis. NODE_OPTIONS are about “flat” command line options, some of them per-process, some of them per-isolate, some of them per-environment. I think for the file based config we need something better than a flat structure to avoid the NODE_OPTIONS and execArgv inheritance validity problem again, and it will take time to address the hierarchical problem for all the possible configs (I also think the current internal classification of options may still contain quite some errors to be surfaced to the config directly) |
I am in favor of a standalone file with the following opinions:
|
I'm a fan of KDL. I feel like the need for a config file format parser if we go with non-json (which we should because comments) suggests it might be of value to also have something built in for generalized parsing, which could also be helpful for stream processing in many cases. I wonder if we should take that into consideration when building whatever format parser we might need. 🤔 |
Whatever format we read needs to be parseable in C++. We added simdjson to be able to parse JSON in C++, so that’s an option. The addition of Another option is to expand out # node.config.env
NODE_OPTION_IMPORT=tsx
NODE_OPTION_EXPERIMENTAL_REQUIRE_MODULE=true And it gets loaded via |
I think making this opt-in would bring us back to square one, especially if the surface is limited to a flat list, then it’s not really too different than .env; but that’s also inconsistent with the broader software convention of loading a rc file by default. |
I think this is actually a flaw, not a benefit, because a flat list of environment variables is not a great way for inherited configurations, see #41103 - child workers and contexts need granular control of configurations. |
I was thinking that whatever config file we choose would be loaded automatically as a semver-major change, and a flag could specify it for older Node versions. A flag might be used for the latest version too in case the user wants to specify a file that’s not the default filename or location. I’m not sure what about #41103 involves any of this. We should just filter out options that don’t make sense to inherit, both currently to resolve that issue and in the future when we add a proper config file. We should do that filtering however the problematic options are defined, such as via environment variables or |
We all hate json. No comments, excessive quoting, no multi line strings, no trailing commas, etc. But:
A The beauty of json is that no one has to agree on very much. |
I am leaning towards "a field specified in package.json pointing to a file" too primarily because to load this by default, it adds overhead to the startup to probe the file system; And since we already probe the file system for package.json, the overhead of probing another field in it would be the smallest. But I do like to see a fixed (at least base) name of the file, because I like the concept of universally recognizable manifest of projects, just like when I check out a JS project/package, if I want to see dependency information I go to package.json, if I want to see TypeScript information I go to tsconfig.json, if I want to see their eslint configs I go to any file that contains the word eslint etc. Maybe something we can do is to support multiple formats (like what eslint does), and JSON would be the first to support, but they all have the same basename, and people just pick the format they like, provided that we do add support for other formats later on. |
The word "user" is unclear here. The core of #41103 is that the end user spawning the process and the library spawning the workers can be controlled by different parties. What happens when the end user wants to apply a configuration to all the child workers (spawned by third party), but not in the main thread running their own code? Do they go over all the worker spawning code in node_modules and update the code in an ad-hoc way? What if they need to persist this setup? I don't think the use of environment variables or command line flags are optimal for threads in the first place - you don't get system equivalent of these, because the concept of environment variables or command line flags are already per-process - like if you do setenv() and getenv() on Linux, the effect is already per-process. The non-standard copies Node.js adds to the worker options are artificial and bound to be somewhat awkward. They are a good enough abstraction when we just want to provide an API with values readily available in the main thread via process.execArgv and process.env. But we are designing an external file-based configuration here, which has not existed for the main thread anyway, so I don't think we should limit the thinking in a model that's already forced. I also think it maybe worth supporting package-scoped runtime configurations, and in that case another flat configuration that can only be applied globally wouldn't work great either. In Node.js, chlid workers can form trees, and packages can form trees, so we need to integrate that tree structure into the configuration design, as one-dimensional lists are bound to be awkward to apply on a two-dimensional tree. |
Using a field in |
That’s the approach we took with https://github.com/pkgjs/support - some advantages of using a separate file for the “meat” include granular codeowners control, and keeping weight out of the published packument (if it’s a published package). |
How could this feature be used successfully across different package.json scripts? Especially regarding loaders - most of my packages and applications today have different loader requirements during linting, testing, and when running in production. (Tests are often the only time that I have a TypeScript loader registered, for instance.) Since Node.js - in contrast to something like ESLint - can serve multiple purposes within a single project, only being able to supply a single config that has to cater to all of them - without causing side effects - seems problematic at best. (Unless I missed something and this was somehow already addressed in the design ideas proposed above.) |
I think we would ship a flag, like "scripts": {
"lint": "node --config=node.base.config.json --config=node.lint.config.json eslint",
"test": "node --config=node.base.config.json --config=node.test.config.json test.js"
} |
That was similar to why I mentioned that we shouldn’t use a flat list for configs. One way to do it is to define a format that allows conditions in the config file. Conditions can include some kind of expressions involving environment variables. In your script fields in package.json, you can set the environment variables that match the desirable conditions in your rc file. Typical script runners should pass down the environment variables to the process launched for the selected script, and if that process is running Node.js (via shebang or something) during startup we locate the config file first, match the environment variables against the conditions in the config file, and apply configurations accordingly.
I think for scripts, Node.js CLI options wouldn’t work very well - typically scripts run executables that don’t necessarily take or pass down Node.js CLI options. |
Actually going back to OP I wonder if we should just support preloaded JavaScript files specified in package.json, because a declaration-based config can always be very limited (that’s why people end up with eslintrc.js). A JavaScript file gives you maximum flexibility to do this kind of condition matching, and to configure the process - we could also consider making the preloaded files applied to child workers by default, and the preloaded file can decide whether it wants to skip the configurafion when it sees it’s not preloaded by the main thread. The bits missing from this would be the ability to specify configs that need to be applied before Node.js is ready to load user JS (say, ICU data). But that subset would be more limited and may be easier to specify in a declaration-based format. |
JavaScript files wouldn't be editable by other tools, for example to register hooks. |
SGTM |
If we go with the simplest APIs in the OP: {
"preload": [
"apm-package/register"
],
"dependencies": {
"apm-package": "1.1.1"
}
} tools can simply update the package.json to add preload scripts. |
Would this preload apply only to |
Although a configuration file is an attractive concept, it could introduce a security risk by allowing arbitrary files to change Node.js behavior. I'm concerned that if this feature is implemented without an opt-in mechanism, it could pose a risk. |
Putting the data format aside, we should talk about resolution of these files. Where would they live relative to the application? Current working directory only? In the |
They would live relative to the package.json file that defines them or at an absolute pas if the reference is absolute. |
In my opinion, it is best for configuration files to be static text files incapable of introducing logic. Logic in configuration introduces complexity that can be very difficult to keep track of. |
I don’t think it needs to be a flat list. We just need to design the config file such that the design is flexible enough to support all of the options we currently support via flags. It could be a flat list, like
I would go with option 2, but either is fine. |
Some notes about this issue from the loaders call yesterday (was just me an @marco-ippolito )
|
I think we can address this with the version field, worst case we add a new version that is better suited for the use cases we discover after the first iteration gets out. I expect us to develop this for a few iterations before reaching stability and settle down on a last stable version, and maybe as the last stable version really gets adopted, we drop support for older versions. (It may not be that much complex to maintain multiple versions though, if the general direction is making it more flexible - I expect what might happen is that we convert version N and version N+1 to the same internal representation, while N typically has fewer fields or a simplified structure compared to N+1). |
I think with versioning its "easy" to get started because its possible to migrate to a newer config without too much disruption, and versions can go through a deprecation cycle. |
Going with "INI files" is basically "let's invent a new config format," because "INI files" are not standardized in any way what-so-ever. I don't understand the desire to do that. The closest thing to an INI standardized format is TOML, which isn't really designed to be INI but very simple configurations can look like INI (https://github.com/toml-lang/toml#comparison-with-other-formats). My vote is against "INI files." Being able to reference documentation about the configuration format is vital. |
I think the idea is to get inspiration from .ini files of other languages, not use .ini files, but .json |
Maybe I wasn't being clear, we were basically thinking about "a JSON representation of the schema that other software typically conveys using the INI format". So a naive draft can be: {
"version": 1,
"preload": [ "amaro"],
"[test]": {
"env": { "MY_CUSTOM_TEST_ENV": "foo" }
},
"[test.coverage]": {
"env": { "NODE_V8_COVERAGE": "./.coverage" }
},
"[start]": {
"preload": [ "./instrumentation.ts" ]
}
} TBD how to map between sections in the rc file to script fields or bin fields, maybe it can be done via an environment variable, or a cli flag, but that needs to be understood by the script runners/bin runners. |
The clarification makes sense. Thank you. |
After using the native test runner for quite some time, the addition of a config file will greatly improve the experience and IDE integrations Adding a config file will help the test runner with:
I encountered too many times the thought - it would be so much easier with config file |
We'll have more discussions about this proposal again on todays loaders call (nodejs/loaders#222). Just trying to do a brain dump before the call:
|
{
"version": "v0", // version as a string so we can support semver and experimental versions
"node": { // node namespace so user tools can put garbage outside
"startup-flags": { // these flags are allow-listed
"test": true, // equals to --test
"transform-types": true,
"import": "amaro/register"// equals to --import="amaro/register" if multiple then array
},
"test-flags": {
"env": { "MY_CUSTOM_TEST_ENV": "foo" },
"coverage": {
"env": { "NODE_V8_COVERAGE": "./.coverage" }
}
}
}
cli arguments always override config file ones |
Would it still be possible to configure |
I can't because of my day job. Sorry. Happy to provide any feedback via GitHub though. |
I think one way to specify it using sections could be: {
"noderc": "./noderc.json",
"scripts": {
// The script field name -> NODE_RC_SECTION mapping could be handled automatically by newer versions
// of task runners, it's up to the users to decide what/which version of task runners (node --run, npm run, etc.)
// they should pin and whether explicitly mapping it with e.g. cross-env is necessary for their project.
// The environment variable can still serve as a user-facing config for overrides/debugging.
"test": "cross-env NODE_RC_SECTION=test node --test",
"test:coverage": "cross-env NODE_RC_SECTION=test.coverage node --test"
}
} {
"[test]": { // Applied when NODE_RC_SECTION=test
"test": {
"reporter": "tap",
"name-pattern": [ "test [1-3]", "/test [4-5]/i" ]
}
"[coverage]": { // Applied when NODE_RC_SECTION=test.coverage
"test": {
"reporter": "lcov", // overrides the "tap" setting from higher-level section
"reporter-destination": "lcov.info",
"coverage": true,
}
}
}
} Alternatively, we could also hoist the grouping out of one single file and support multiple rc files using an environment variable: {
"scripts": {
// It seems harder / too magical for the task runners to handle the script name -> NODE_RC_FILE value
// mapping automatically. Probably best to require the NODE_RC_FILE config to be explicit.
"test": "cross-env NODE_RC_FILE=./.noderc.test.json node --test",
"test:coverage": "cross-env NODE_RC_FILE=./.noderc.test.coverage.json node --test"
}
} { // ./.noderc.test.json
"test": {
"reporter": "tap",
"name-pattern": [ "test [1-3]", "/test [4-5]/i" ]
}
} { // ./.noderc.test.coverage.json
"extends": [ "./.noderc.test.json" ]
"test": {
"reporter": "lcov", // overrides the "tap" setting from ./.noderc.test.json
"reporter-destination": "lcov.info",
"coverage": true,
}
} |
I like the |
For the script -> rc config mapping, I wonder if we should just allow the mapping in package.json, like this: {
"noderc": {
"*": "./.noderc.json" // This is what "noderc": "./.noderc.json" shorthand implies - maybe it should be "default"?
"test": "./.noderc.test.json", // Task runner are supposed to set up NODE_RC_FILE for the test target, and this is supposed to extend .noderc.json
"test:coverage": "./.noderc.test.coverage.json"
},
"scripts": {
"test": "node --test",
"test:coverage": "node --test"
}
} this needs to be mapped by the task runners, so it will take time for them to roll out the support, meanwhile the |
instead of “default”, perhaps just a |
I feel that it would be less error-prone to name it after something that's very unlikely to be a script target, "*" is probably the least risky one and is the most intuitive IMO, "default" kiiind of make sense (considering there's also "default" in exports conditions), I am not sure how intuitive "true" is - at least I wouldn't be able to understand it intuitively. I am not sure if it's just me but my intuition though would imply "*" would be applied in addition to the file matching the script target name, not instead of, whereas "default" is clearer (given what you'd expect from "default" in export conditions, or in a switch block). |
Another nice thing about having a key for the rc file to be applied is that during the initial iterations, we can make it opt-in (so to try it out, early users need to set something like |
Opened nodejs/loaders#225 to consolidate the ideas so far |
In #52219 it was mentioned that for APM use cases it would be nice to have a way to register loaders without using command line flags. It occurred to me maybe what we need is an initialization config similar to the rc files out there for various applications.
For example in my
.lldbinit
I have these that extend LLDB to help me debugIt seems this is serving the same use case as the Node.js loader registration - run some scripts or register some plugin to Node.js before you actually start running it.
For loading loaders, I guess we can support something like a
.noderc
that is discovered when users start running Node.js, or make that part ofpackage.json
, which allows registering certain hooks before the actual application code is run. For example in the case of package.json, I guess the project pacakge.json would include something like this:If we want something more flexible than "just loading some scripts" though, I guess a dedicated rc file would be easier to use than package.json.
cc @timfish @nodejs/loaders
The text was updated successfully, but these errors were encountered: