-
Notifications
You must be signed in to change notification settings - Fork 29.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
loader, docs, test: named exports from commonjs modules #16675
Conversation
I know a lot of discussion went into the original decision on how mjs would handle cjs modules but after using the system for a while, and talking with a lot of other people in the community, it just seems like the expected behavior and the wanted behavior is to export named based on the keys. This PR implements that in what is hopefully a performant enough solution, although that shouldn't be too much of a problem since this code only runs during initial module loading. This implementation remains safe with regard to named exports that are also reserved keywords such as `class` or `delete`. Refs: nodejs/node-eps#57
I think I am in favor of the general approach, but cc @nodejs/tsc because if we decide to go with this we’re likely stuck with it.
Can you add a test for that? |
doc/api/esm.md
Outdated
representing the value of `module.exports` at the time they finished evaluating. | ||
When loaded via `import` these modules will provide a `default` export | ||
representing the value of `module.exports` at the time they finished evaluating, | ||
and named exports for each key of `module.exports`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add a sentence explaining what will not be supported, i.e. modifying module.exports
with an asynchronous operations will not update that property in ESM. It seems an edge case, but I do not see a way to provide a good error message for it we should be careful in how we document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with some nites
assert.deepStrictEqual(Object.keys(fs), ['default']); | ||
assert(fs); | ||
assert(fs.readFile); | ||
assert(readFile); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you test that readFile
is the same as fs.readFile
?
/cc @guybedford I think this fixes #16474 by moving the execution phase of CJS modules. Could that cause real issues? |
A very very large portion of the js community (babel/typescript) use `__esModule` to define an es module in CJS.
Update: most of these points are no longer relevant when restricting named exports extraction ( I'm glad this can be properly considered in its technical context now. But there are serious concerns here which need to be carefully discussed, so I wouldn't advise rushing this by any means:
Getter triggering and performance as above are probably my biggest concerns here, for which there don't seem to be easy paths forward. |
with my last commit here i changed the implementation a bit (see the commit message) so @mcollina you might want to revoke your approval until you re-review it. to respond to @guybedford i suppose it wouldn't be too difficult to use getOwnPropertyDescriptor although that would raise serious performance concerns for me. this could also be fixed by my latest commit and chalk choosing to not use __esModule, if __esModule doesn't get too much backlash reasoning behind the only bad thing about this is that it might make people get weirded out when then try to go to browser with the expectation that it will work the same way, but i'm hopeful that with good docs and explanation we can not have to worry about this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for heads up @devsnek, yes, there is something I would like to change.
lib/internal/loader/ModuleRequest.js
Outdated
const CJSModule = require('module'); | ||
const pathname = internalURLModule.getPathFromURL(new URL(url)); | ||
const exports = CJSModule._load(pathname); | ||
const es = !!exports.__esModule; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should promote the use of a Symbol
instead. Adding a property might not be feasible for a lot of modules. We should also support __esModule
for backward compat with babel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thats a fantastic idea, would attaching it to the module
module, perhaps as Module.esModule
be good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps something in the name to indicate it is a symbol for CommonJS interop cases? esModuleInterop
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with making it as long as we want. I would use Symbol.for()
so it can be backward-compatibile with older versions of node that do not have the symbol, plus other tools that might want to process it.
module.exports
@devsnek nice, __esModule as an opt-in and a symbol seems like it could be a very sensible option here, and chalk doesn't seem to use it currently. It also handles the dealing with the |
recommends @@esmodule over __esModule
lib/internal/loader/ModuleRequest.js
Outdated
@@ -19,6 +19,8 @@ const search = require('internal/loader/search'); | |||
const asyncReadFile = require('util').promisify(require('fs').readFile); | |||
const debug = require('util').debuglog('esm'); | |||
|
|||
const esModuleSymbol = exports.esModuleSymbol = Symbol('esModule'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer Symbol.for('esModuleInterop')
so that an import of 'module'
is not required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that was my first approach but it failed the tests. i speculate that symbols aren't shared between v8 contexts or something, if you have a solution please lemme know
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
> vm.runInNewContext('Symbol.for("foo")') === Symbol.for('foo')
true
I think this should work?
lib/internal/loader/ModuleRequest.js
Outdated
const keys = es ? Object.keys(exports) : ['default']; | ||
return createDynamicModule(keys, url, (reflect) => { | ||
if (es) { | ||
for (const key of keys) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There might be some edge cases where module.exports
is a proxy, can you please check if that would cause problems?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use a for(;;)
loop here. They are faster.
doc/api/esm.md
Outdated
CommonJS modules, when imported, will be handled in one of two ways. By default | ||
they will provide a single `default` export representing the value of | ||
`module.exports` at the time they finish evaluating. However, they may also | ||
provide `@@esModule` or `__esModule` to use named exports, representing each |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What’s the motivation for keeping __esModule
when a symbol is available? You can always import { esModule } from 'module';
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or, just a thought: What if this was an alternative to module.exports
, e.g. module.esmExports
? We could check for the presence of that property when the script finished.
The upsides of this would be that you don’t need a symbol, and you don’t need to modify the actual exports object
Considering that we are supporting |
lib/internal/loader/ModuleRequest.js
Outdated
const pathname = internalURLModule.getPathFromURL(new URL(url)); | ||
const exports = CJSModule._load(pathname); | ||
const es = Boolean(exports[esModuleSymbol] || exports.__esModule); | ||
const keys = es ? Object.keys(exports) : ['default']; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would Object.getOwnPropertyNames(exports)
make more sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i chose to use Object.keys so that it only exports enumerable properties. (it says so in the esm doc)
If I understand correctly, this would currently require the imported module to opt into this mechanism? I don’t think that’s a good idea at all… if you want to opt in, you can already just go with a wrapper If the primary goal is interoperability with the existing ecosystem (and for me it is), then I think this should be on by default for all CJS modules. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@addaleax i'm fairly sure that enabling this behaviour by default would be kinda dangerous @mcollina i might have to use a v8 runtime intrinsic to check if something is a proxy (IsJSProxy) if you guys are ok with baking one in bootstrap_node.js, and that might also have issues with chakra, do you guys have a binding to detect that? |
@devsnek Why would it be dangerous? I get that the mapping won’t work 1:1 but really, it should be fine for nearly everything, right? Also, I don’t quite get what the concern with Proxies is. Why do we want different treatment? |
i think the safest thing to do by default at this point is probably just the existing behavior (sad ikr) where it attaches to default because of as for proxies, trying to iterate over them could do interesting things depending on their setup like create an infinite loops |
@devsnek I don’t quite see why it would be safer or not to do that? Maybe it would help if you could explain what “safe” and “dangerous” mean here? And sure, iterating over Proxies could do weird things (and in particular, throw exceptions) – but I wouldn’t say that’s our problem. If somebody writes a buggy Proxy, it’s broken anyway, right? Also: Users are opting into whatever behavior we choose when they make the transition of the importing module to ESM, or writing new code as ESM. If there are problems with importing existing ecosystem code, they can report those as bugs, right? |
Like, I really liked the initial version of this PR. ;) |
i guess if the feature can be disabled |
doc/api/esm.md
Outdated
CommonJS modules, when imported, will be handled in one of two ways. By default | ||
they will provide a single `default` export representing the value of | ||
`module.exports` at the time they finish evaluating. However, they may also | ||
provide `@@esModuleInterop` or `__esModule` to use named exports, representing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps clarify exactly what these properties are - "provide a @esModuleInterop
or __esModule
boolean export indicating that a snapshot should be taken of each enumerable key..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good, and is it @@
or @
to doc a symbol, i thought it was two
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it's still two, that was a typo.
Without reading everything - how can and should we progress here? |
@BridgeAR I believe @bmeck is presenting to TC39 next week an agenda item to discuss the execution ordering in the specification and also to consider locking down the spec to ensure such a feature like this would not be possible by ensuring execution ordering for other format interops. So the current status of this PR / feature is dependent on how that discussion goes. |
I don't think Node should provide the properties of the In Node, it's common to import modules like this: // foobar.js, a CJS file
module.exports = {foo: 'bar', default: 'baz'};
// another CJS file
var foobar = require('./foobar');
foobar.foo // => 'bar'
foobar.default // => 'baz' Therefore, it's important that // foobar.js, a CJS file
module.exports = {foo: 'bar', default: 'baz'};
// an ESM file
import foobar from './foobar';
foobar.foo // => 'bar'
foobar.default // => 'baz' Node must have agreed, since CJS is already importable in ESM like that. Now, recall that import foobar from './foobar';
import {default as foobar} from './foobar'; We can visualize "the set of things exported by a module" as a map. In the case of foobar.js: {
default: {foo: 'bar', default: 'baz'}
} The set of names in the "module exports map" is a different set of names than the keys of an object which is a Say that I have an ES module like this: export default {foo: 'bar', default: 'baz'};
export var foo = 'qux'; We can visualize that ES module's "module exports map" like this: {
default: {foo: 'bar', default: 'baz'},
foo: 'qux'
} The set of names in this ES module's "module exports map" is Going back to the "module exports map" for foobar.js, which looks like this: {
default: {foo: 'bar', default: 'baz'}
} Given that, when we do this: import {foo} from './foobar'; Then what should happen? The set of names in the CJS module's "module exports map" is Still unsure? Consider this: import foobar, {foo, default as _default} from './foobar'; What should the value of You say that what's expected and wanted is for users to be able to import Users want If users want to pull properties off of a "main" object, then they can just do that: import foobar from './foobar';
var {foo, default: _default} = foobar;
foo // => 'bar'
_default // => 'baz' No special enhancements based-off logical contradictions are required. If a module author wants to publish code with exports other than |
@jacksonrayhamilton hello and thank you for your input. I appreciate you taking the time to voice your opinion, however there are a few points i would like to bring up in regard to your comment. |
@devsnek The depth of my explanation is not to imply that I think the authors don't know what they are talking about. (On the contrary, if I'm wrong about modules, then I think this would be one of the best groups of people to convince me.) The depth is to establish, starkly, the false equivalency which is the crux of my argument. In the discussions I've seen on this feature, and from this PR's description, my impression is that "the need" for the feature is "a given." Therefore, I am taking extra care to unearth the problem I see in this being a premise. People often don't expect and react negatively to challenges to an established premise, so I am utilizing even more fundamental knowledge (the mechanics of modules) to make my claim. I'm curious what you think about the inconsistency I've brought up: Because As for this being an optional feature, I don't think it's good for module authors to be able to opt into this. The first case I'll address is the one where ES module code is being compiled to CJS, which seems like the most common reason for When an author of ES module code compiles ESM to CJS and the compiled CJS code claims to be an ES module, the resulting API is unpleasant for other CJS code requiring the compiled code. Take the most basic case for requiring a module: Allowing module authors to opt into an ES module facade for CJS perpetuates the compilation of CJS-unfriendly APIs. Modules that look like this will live in an export default function foobar () {} And will continue to be compiled into modules that look like this in a 'use strict';
exports.__esModule = true;
function foobar () {}
exports.default = foobar; And If you require that module authors only deliver ES modules normally, you clear the way for authors to compile CJS-friendly APIs alongside the native ES modules for better backwards compatibility. Now, a module like this named export var bar = 'qux';
export default function foo () {} And, compiled alongside it, another file named 'use strict';
var bar = 'qux';
function foo () {}
exports = module.exports = foo;
exports.bar = bar; And The second case I'll address is the one where an author of originally-CJS code wants to make that code available via an ES module API: I'm going to assume that most authors of originally-CJS code are not already manually adding Therefore, these authors have two options:
I think the second option is a much better one for authors to follow and consumers to enjoy. It seems to me this PR would expedite the consumption of today's compiled ES modules, at the expense of the legacy codebases' maintainers for whom we want to make life easier. It looks like a shortcut with drawbacks. In contrast, my ".mjs and .js sibling" strategy requires a subtle paradigm shift in compilation and distribution methodology, and benefits consumers new and old. As a bonus, Node doesn't need to change. |
many libs do |
@bmeck any word on the outcome from your topic @ tc39? |
@devsnek meeting notes are in queue to be released by end of this week. |
@devsnek Notes from TC39 on the topic are out. Reordering is probably not something I would recommend. I'd suggest other routes that don't require reordering be looked at. |
@bmeck are there any other ways to achieve this? i wasn't able to think of any |
@devsnek not using runtime generated values. alternatives all require some form of preprocessing/parsing. my branch attempting this using a pragma is at https://github.com/bmeck/node/tree/named-export-core , but the cost of using proxies slows down all code that uses the pragma enough to make it undesirable. alternatives generally are going to be in a similar vein to mine and use some form of parsing to get the list of names, then some protocol to setup the export syncing (in my case it was through a Proxy, but more runtime performant measures are probable possible w/ source transforms at the cost of compile time). |
i'll just close this as there doesn't seem to be a "safe" way to achieve the wanted behavior 😢. thanks to everyone who joined along for the ride 😄 |
@bmeck do you know if there's a way to see what names are wanted for a particular import specifier?
i could parse the source for the import specifiers separately but i don't want to incur the perf penalty |
@devsnek you can parse the source and see those. However, even with that the record would be idempotent, so given: // a
import {x} from './cjs'; // b
import {y} from './cjs'; You would be stuck with one shape of |
@bmeck i considered that, my current solution is to not put dynamic modules created by the cjs transform in the module map, and use the require cache for idempotence. Obviously the issue them becomes like what if values change with getters or something but then that can be handled by __esModule: I haven't quite figured out what to do with namespace imports/exports |
@devsnek the ECMA262 module map is managed by v8, we can't control it. We only control the loader's module map. |
@bmeck yes i meant the loader's module map, wouldn't that make it rerun the transform for each request of the cjs module |
@devsnek not for various things like: import {foo} from './foo'; // this populates a module record with a binding {foo}
import('./foo').then(ns => {
// ns is the exact same module record as above, unconditionally
}) |
can we just propose a "ReflectiveModule" with getters and such to the v8 team :( |
I know a lot of discussion went into the original decision on how mjs would handle cjs modules but after using the system for a while, and talking with a lot of other people in the community, it just seems like the expected behavior and the wanted behavior is to export named based on the keys. This PR implements that in what is hopefully a performant enough solution, although that shouldn't be too much of a problem since this code only runs during initial module loading.
This implementation remains safe with regard to named exports that are also reserved keywords such as
class
ordelete
.EDIT: many changes to original impl, definitely read comments and file diff
Refs: nodejs/node-eps#57
Refs: nodejs/abi-stable-node#256 (comment)
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passesAffected core subsystem(s)
loader