Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: fallback to manifest values if they're not specified in ExtismPluginOptions #73

Merged
merged 35 commits into from
Jul 9, 2024

Conversation

mhmd-azeez
Copy link
Collaborator

@mhmd-azeez mhmd-azeez commented Jun 25, 2024

Fixes #58
Fixes #55

  • config
  • allowed_hosts
  • allowed_paths
  • memory
  • timeout
  • tests

@mhmd-azeez mhmd-azeez marked this pull request as ready for review June 25, 2024 19:27
@mhmd-azeez
Copy link
Collaborator Author

mhmd-azeez commented Jun 25, 2024

@chrisdickinson i am on the fence about whether we should contact concat the fields, or only use manifest when nothing is specified in ExtismPluginOptions

@mhmd-azeez mhmd-azeez requested a review from bhelx June 25, 2024 19:28
Copy link
Contributor

@chrisdickinson chrisdickinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look at this! This PR jogs my memory - I think I was also on the fence about whether to allow the manifest to control allowedHosts/allowedPaths in the absence of explicit configuration from the caller. Since the manifest can be loaded via URL it strikes me that the manifest author and the sdk integrator might not be the same party.

At the risk of complicating things, I'd almost want to have an explicit createPlugin-only option like allowManifestConfig to determine what options the manifest is allowed to control.

src/mod.ts Outdated
@@ -91,7 +99,7 @@ export async function createPlugin(
);
}

const [names, moduleData] = await _toWasmModuleData(await Promise.resolve(manifest), opts.fetch ?? fetch);
const [names, moduleData] = await _toWasmModuleData(m, opts.fetch ?? fetch);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to support manifest options when the manifest is provided via fetch, we might rework _toWasmModuleData to return a third item, manifestOpts, then perform this call before normalizing the options in this function.

intoManifest is the chokepoint at which all the disparate input types are unified into something "manifest-like", so we could extract manifest-provided options here and then use them in createPlugin to finish normalizing the input options.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this what you meant?

const manifestOpts : ManifestOptions = {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep! That's perfect, thank you!

@mhmd-azeez
Copy link
Collaborator Author

At the risk of complicating things, I'd almost want to have an explicit createPlugin-only option like allowManifestConfig to determine what options the manifest is allowed to control.

I am hesitant about this because this behavior is different from what the other SDKs are doing. Curious what @bhelx thinks

@mhmd-azeez
Copy link
Collaborator Author

Seems like manifest is missing a few options, I will try to add them all in this PR

src/utils.ts Outdated
@@ -0,0 +1,15 @@
export function withTimeout<T>(promise: Promise<T>, timeoutMs?: number | undefined): Promise<T> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To double-check semantics of the other sdks - in other sdks, it looks like if we cancel an operation due to a timeout, we expect the operation to halt (using increment_epoch).

If I understand this correctly, this is kind of a tricky feature to implement in JS. We can support this for runInWorker: true plugins – we can kill the worker thread on timeout – but I don't think we have a cancelable promise otherwise. This would be user-visible if the plugin has access to host functions.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great points, while I was able to implement it in the background plugin, the foreground plugin doesn't seem to work:
43e75ac

If I can't find a solution, I might ignore timeout for now and create a separate issue for it

@mhmd-azeez
Copy link
Collaborator Author

mhmd-azeez commented Jun 30, 2024

@chrisdickinson i have added checks for cancellation before all of the extism functions + host functions. So the function call not being cancelled shouldn't be user visible. The only issue is timeouts don't seem to work while a wasm function is being executed. So ForegroundPlugin can't set cancelled to true. Any ideas?

Otherwise, we can just throw an exception when a timeout is specified for runInWorker: false

@chrisdickinson
Copy link
Contributor

@chrisdickinson i have added checks for cancellation before all of the extism functions + host functions. So the function call not being cancelled shouldn't be user visible. The only issue is timeouts don't seem to work while a wasm function is being executed. So ForegroundPlugin can't set cancelled to true. Any ideas?

Ah yeah – unfortunately, since JS is single-threaded & Wasm runs on the main thread, control is never yielded back to the event loop to trigger the timer. I think it's probably best to make timers a runInWorker-only feature. If we do that, I think we might be able to remove the cancellation checks on CallContext and replace them with a worker.terminate() call (& some code to "restart" the worker from src/background-worker.ts.)

@mhmd-azeez
Copy link
Collaborator Author

@chrisdickinson thanks for your suggestion, I think it's much more robust now. Please take a look and lt me know if you have any more feedback

@mhmd-azeez
Copy link
Collaborator Author

for some reason, the tests fail in bun

@mhmd-azeez
Copy link
Collaborator Author

okay, not sure why only Bun is getting stuck. This is the minimal repro:
b6300e4

@chrisdickinson any ideas?

@bhelx
Copy link
Contributor

bhelx commented Jul 2, 2024

Thanks for taking a look at this! This PR jogs my memory - I think I was also on the fence about whether to allow the manifest to control allowedHosts/allowedPaths in the absence of explicit configuration from the caller. Since the manifest can be loaded via URL it strikes me that the manifest author and the sdk integrator might not be the same party.
At the risk of complicating things, I'd almost want to have an explicit createPlugin-only option like allowManifestConfig to determine what options the manifest is allowed to control.

@chrisdickinson I think it's okay that manifest can be a url, but we should just draw bright lines around it. manifest wasn't ever really meant to be something a plug-in dev could author. We have had discussions about having some kind of separate "manifest override" kind of object that would give the host the controls to allow a guest to override specific parts of the manifest. But I think that is a bigger extism wide thing we need to propose and discuss.

@chrisdickinson
Copy link
Contributor

any ideas?

Nothing too concrete. When I've run into Bun hangs and crashes before, it's been an outcome of their focus on performance. That tends to lead to bugs like subtle timing changes – so my first guess would be that they're doing something with the worker that causes it to spin up very quickly the second time its spawned, possibly before we've attached a message listener in the host. We could test this hypothesis by adding some logs to the host to see if or where the createWorker call is hanging – then, if it's hanging while waiting on either of these functions, we might add a setTimeout delay of 1ms or so to the worker before sending the initialized message.

The other thing that strikes me is that we null out this.#request here. That might be enough for the promise to get GC'd in other runtimes, but it could be that Bun's hanging on to it. It might be best to reject that promise by calling if (this.#request) this.#request[1](new Error('timed out')).

@mhmd-azeez
Copy link
Collaborator Author

Ah, it's actually node getting stuck at the end

@mhmd-azeez
Copy link
Collaborator Author

Sorry for doubting you Bun

@bhelx
Copy link
Contributor

bhelx commented Jul 2, 2024

Sorry for doubting you Bun

check your nodejs bias at the door!

@mhmd-azeez
Copy link
Collaborator Author

Okay, as far as I can see, NodeJS can't terminate a worker that's executing a rogue wasm function. Here is a repro in our case:

  async call(funcName: string, input?: string | Uint8Array): Promise<PluginOutput | null> {
    const index = this.#context[STORE](input);

    try {
      const [errorIdx, outputIdx] = CAPABILITIES.supportsTimeouts ?
        await withTimeout(this.callBlock(funcName, index), this.opts.timeoutMs) :
        await this.callBlock(funcName, index);

      const shouldThrow = errorIdx !== null;
      const idx = errorIdx ?? outputIdx;

      if (idx === null) {
        return null;
      }

      const block = this.#context[GET_BLOCK](idx);

      if (block === null) {
        return null;
      }

      const buf = new PluginOutput(
        CAPABILITIES.allowSharedBufferCodec ? block.buffer : new Uint8Array(block.buffer).slice().buffer,
      );

      if (shouldThrow) {
        const msg = new TextDecoder().decode(buf);
        throw new Error(`Plugin-originated error: ${msg}`);
      }

      return buf;
    } catch (err) {
      if (err instanceof Error && err.message === 'Function call timed out') {
        console.log('Function call timed out, restarting worker')
        await this.restartWorker();
      }

      throw err;
    }
  }
test('plugin functions cant exceed specified timeout', async () => {
    let x = 0;

    try {
      const plugin = await createPlugin(
        { wasm: [{ url: 'https://localhost:8124/wasm/sleep.wasm' }], timeoutMs: 1 },
        {
          useWasi: true,
          functions: {
            "extism:host/user": {
              notify() {
                x++;
              },
              get_now_ms() {
                return BigInt(Date.now());
              }
            }
          },
          runInWorker: true
        });

      try {
        const [err, _] = await plugin.call('sleep', JSON.stringify({ duration_ms: 100 })).then(
          (data) => [null, data],
          (err) => [err, null],
        );

        assert(err)
        assert.equal(err.message, 'Function call timed out');
        await new Promise(resolve => setTimeout(resolve, 200));
        assert.equal(x, 0)

      } finally {
        await plugin.close();
      }
    }
    catch (err) {
      if (!CAPABILITIES.supportsTimeouts) {
        assert(err instanceof Error);
        assert.equal(err.message, 'Cannot set timeoutMs; current context does not support timeouts');
      }

      throw err;
    }
  });

async restartWorker() {
    await this.close();

    this.#context[RESET]();
    this.#request = null;

    this.worker = await createWorker(this.opts, this.names, this.modules, this.sharedData);
    this.worker.on('message', (ev) => this.#handleMessage(ev));
  }

async close(): Promise<void> {
    if (this.worker) {
      this.worker.terminate();
      this.worker = null as any;
    }
  }

Everything is executed (even the terminate call), but the process never exits.

I am going to skip timeout in this PR and create a separate issue for it

@mhmd-azeez
Copy link
Collaborator Author

This PR is ready for review

Copy link
Contributor

@chrisdickinson chrisdickinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whew, great work! I left some comments about places where we might still diverge from other extism host sdk behavior with regards to thrown errors. I also left an optional comment about a foible of mine.

If you'd like to merge from here I think this PR brings us a lot closer to the behavior of other sdks – we can make a ticket for handling error states more akin to how the rust-sdk handles it.

src/background-plugin.ts Show resolved Hide resolved
src/background-plugin.ts Show resolved Hide resolved
src/background-plugin.ts Show resolved Hide resolved
src/call-context.ts Outdated Show resolved Hide resolved
src/background-plugin.ts Outdated Show resolved Hide resolved
Copy link
Contributor

@chrisdickinson chrisdickinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Great work!

@mhmd-azeez mhmd-azeez merged commit 389a732 into main Jul 9, 2024
4 checks passed
@mhmd-azeez mhmd-azeez deleted the fix/manifest-options branch July 9, 2024 06:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Manifest.config is not being used manifest: support memory.max_http_response_bytes field
3 participants