Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add some optional defines to disable wasm2c trap checks #1432

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

kripken
Copy link
Member

@kripken kripken commented May 18, 2020

  • WASM_RT_NO_MEMORY_CHECKS: Prevents checks from being emitted for memory
    accesses being in bounds.
  • WASM_RT_NO_STACK_DEPTH_CHECKS: Prevents checks from being emitted for
    exhausting the stack.

When compiling a commandline tool like say wasm-opt, those are not relevant,
and they do add significant overhead both to binary size and to speed.

Thoughts?

Copy link
Member

@binji binji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % few comments

"#define FUNC_PROLOGUE \\\n"
" if (++wasm_rt_call_stack_depth > WASM_RT_MAX_CALL_STACK_DEPTH) \\\n"
" TRAP(EXHAUSTION)\n"
"#else\n"
"#define FUNC_PROLOGUE\n"
"#endif\n"
"\n"
"#define FUNC_EPILOGUE --wasm_rt_call_stack_depth\n"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove here too? Probably doesn't hurt anything, but it seems odd to decrement the value here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, not sure how I missed that, thanks...

@@ -22,8 +26,12 @@
? ((t)table.data[x].func)(__VA_ARGS__) \
: TRAP(CALL_INDIRECT))

#ifndef WASM_RT_NO_MEMORY_CHECKS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe WASM_RT_NO_MEMORY_BOUNDS_CHECKS instead? Not sure what other memory checks we may have, but seems better to be specific.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good.

@binji binji mentioned this pull request May 20, 2020
@sunfishcode
Copy link
Member

Can you say more about the motivation for adding WASM_RT_NO_MEMORY_CHECKS? A core part of the WebAssembly ecosystem is that the sandbox is always enabled (outside of debugging an implementation or experimentation). When the sandbox doesn't work or is too slow, we work to find ways to make the sandboxing better rather than disabling it.

@kripken
Copy link
Member Author

kripken commented May 20, 2020

Oh, I wrote this in the other PR just now, but here is the reason: If you compile with wasm2c into a native executable, and then run it from the commandline, the OS will have it sandboxed in a process anyhow. There won't be anything else in the process with it to corrupt (except maybe the little "runtime" in main.c).

Running the emscripten benchmark suite with wasm2c, these checks make a big difference. Often wasm2c is slower than wasm, but faster than wasm and often comparable to a normal native build, when not doing these checks.

Obviously this should not be on by default! And obviously for a library (and not a commandline utility) it would be a bad idea.

@sunfishcode
Copy link
Member

A wasm program is running without linear memory sandboxing can read and write to around 8 GiB of host process address space. Depending on the address space layout, this could be enough to do something which triggers a ROP, and in that case the program can probably break out of a typical OS process.

The WebAssembly design had several choices as to the behavior of out-of-bounds linear memory accesses, including masking/wrapping and nondeterministic behavior, and deliberately chose deterministic trapping, even with the awareness that some implementations would have overhead. If this has turned out to be too burdensome for important implementations, it would be appropriate to take the data to the CG.

I look to WebAssembly-organization projects such as wabt to lead by example in following the spec. There are many parts of the spec which could be ignored to achieve significant benchmark wins on some implementations today. But that's not WebAssembly.

@kripken
Copy link
Member Author

kripken commented May 20, 2020

I definitely agree with you in general @sunfishcode . However, the use case I have here (as mentioned in #1434) is compiling trusted code using wasm2c, where a ROP isn't relevant.

I agree that that is not a standard WebAssembly use case! But wasm toolchain components, like wasm2c here, are potentially useful for more than the standard things, and optional support for those would be cool.

@binji
Copy link
Member

binji commented May 20, 2020

I agree that a better solution (at least for wasm-opt) in general would be to provide a cheaper memory sandbox (a la trap handling or the like). It should be possible to do that in wasm2c at some point.

That said, I have experimented with disabling these checks before too (for performance analysis), so I can see the value. We certainly shouldn't encourage their use for most programs, though I wonder how much worse the security is compared to a native executable.

@sunfishcode Perhaps as a compromise we should avoid mentioning these defines in the docs? Or maybe change their names UNSAFE_DISABLE_MEMORY_CHECKS etc.?

@sunfishcode
Copy link
Member

I agree that for the specific use case @kripken describes here where the program is trusted anyway, security is not a concern. And I agree that performance experimentation is cool, though it's pretty easy for people to experiment by editing the code.

I don't think a project as prominent as wabt should have options that intentionally violate WebAssembly standards, for the purpose of allowing third-party projects to depend on them.

@binji
Copy link
Member

binji commented May 26, 2020

@sunfishcode OK, but what it is the path forward here? I guess you could say that for @kripken's use case, it's up to them to run a sed script over the output, or modify the output manually. That feels like a worse outcome, but maybe it's OK for now.

Let's do this -- for now, let's hold off on this PR. I'm interested in adding trap-handling to wasm2c anyway (it'll be useful for experimenting with memory64 sandboxing too), which will make this PR moot. I've added an issue for this here: #1440

@kripken
Copy link
Member Author

kripken commented May 26, 2020

Trap handling will help a lot in many cases, but I don't think it would make this PR moot. There are places where trap handlers aren't an option for various reasons (32-bit systems, places without full OS permissions).

@binji
Copy link
Member

binji commented May 26, 2020

@kripken Good point. Here's another slightly less opinionated option -- we change the various defines (MEMCHECK, FUNC_PROLOGUE) to check whether they're already defined. In practice this is not much different, but it is less like an official option provided by wasm2c this way.

@kripken
Copy link
Member Author

kripken commented May 27, 2020

I'd be ok with checking for the defines. Sounds ok for now.

But in the long term I think we should resolve this debate. For example, it would be interesting if wasm2wat could turn

boundscheck(x)
load(x)
..no branches..
boundscheck(x + 4)
load(x + 4)
..no branches..
boundscheck(x + 8)
load(x + 8)

into

boundscheck(x + 8)
load(x)
..no branches..
load(x + 4)
..no branches..
load(x + 8)

which saves 2 checks. This changes wasm semantics slightly since we can trap earlier (for that reason I believe C compilers can't do it automatically). This could be a large optional speedup which we can't do with a previous #define. It would be unfortunate if we couldn't do this kind of thing in wasm2wat and had to resort to some kind of post-processing tool.

@sunfishcode
Copy link
Member

I appreciate the creativity, but ultimately, checking whether MEMCHECK is already defined would still be setting up an extension point for the purpose of letting downstream dependencies alter wasm spec requirements.

(If a general-purpose tracing or debugging extension point is desired, such a thing should be designed to be independent of the sandboxing mechanism anyway.)

An expectation of spec conformance is one of the tools the WebAssembly community has for pressuring everyone, even powerful companies, to play by the same rules, and while I recognize it has practical limitations, I am opposed to weakening it.

@kripken
Copy link
Member Author

kripken commented May 27, 2020

@sunfishcode

I think it would be useful for you to explain your position some more - it's a novel one that I've never encountered before. Perhaps I am misunderstanding you.

To avoid confusion, here is my position: any wasm VM should conform to the wasm spec, period. Toolchain components that consume and produce wasm should do so according to the spec, period. However, a toolchain may also do more things. For example, wabt today can read wasm and emit a C-like format using wasm-decompile, and binaryen can read wasm and emit asm.js-like JS using wasm2js. Those output formats are not wasm.

wasm-decompile doesn't emit runnable code, but wasm2js does, and it does not 100% conform to wasm semantics, intentionally, because that would not work - bounds checking and alignment checking each load and store would be too slow when using JS Typed Arrays. This is ok because wasm2js output is not wasm, it's JS.

The proposed wasm2c stuff in this PR is directly analogous: it's not wasm, it's C.

(If you disagree, do you also disagree on wasm-decompile and/or wasm2js?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants