Design: Proposal: Await

Created on 18 May 2020  ·  96 Comments  ·  Source: WebAssembly/design

@rreverser and I would like to put forward a new proposal for WebAssembly: Await.

The motivation for the proposal is to help "synchronous" code compiled to WebAssembly, that does something like a read from a file:

fread(buffer, 1, num, file);
// the data is ready to be used right here, synchronously

This code can't easily be implemented in a host environment that is primarily asynchronous and that would implement "read from a file" asynchronously, for example on the Web:

const result = fetch("http://example.com/data.dat");
// result is a Promise; the data is not ready yet!

In other words, the goal is to help with the sync/async issue that is so common with wasm on the Web.

The sync/async issue is a serious problem. While new code can be written with it in mind, large existing codebases often cannot be refactored to work around it, which means they cannot run on the Web. We do have Asyncify which instruments a wasm file to allow pausing and resuming, and which has allowed some such codebases to be ported, so we are not completely blocked here. However, instrumenting the wasm has significant overhead, something like a 50% increase to code size and a 50% slowdown on average (but sometimes much worse), because we add instructions to write out / read back in the local state and call stack and so forth. That overhead is a big limitation and it rules out Asyncify in many cases!

This proposal's goal is to allow pausing and resuming execution in an efficient way (in particular, without overhead like Asyncify has) so that all applications that encounter the sync/async problem can easily avoid it. Personally we intend this primarily for the Web, where it can help WebAssembly integrate better with Web APIs, but use cases outside the Web may be relevant as well.

The idea in brief

The core problem here is the mismatch between synchronous wasm code and an asynchronous host environment. Our approach is therefore focused on the boundary between a wasm instance and the outside. Conceptually, when a new await instruction is executed, the wasm instance "waits" for something from the outside. What "wait" means would differ between platforms, and may not be relevant on all of them (just as not all platforms find the wasm atomics proposal relevant), but on the Web platform specifically, the wasm instance would wait on a Promise and pause until it resolves or rejects. For example, a wasm instance could pause on a fetch network operation, and be written something like this in .wat:

;; call an import which returns a promise
call $do_fetch
;; wait for the promise just pushed to the stack
await
;; do stuff with the result just pushed to the stack

Note the general similarity to await in JS and other languages. While this is not identical to them (see details below) the key benefit is that it allows writing synchronous-looking code (or rather, to compile synchronous-looking code into wasm).

The details

Core wasm spec

The changes to the core wasm spec are very minimal:

  • Add a waitref type.
  • Add an await instruction.

A type is specified for each await instruction (like call_indirect), for example:

;; elaborated wat from earlier, now with full types

(type $waitref_=>_i32 (func (param waitref) (result i32)))
(import "env" "do_fetch" (func $do_fetch (result waitref)))

;; call an import which returns a promise
call $do_fetch
;; wait for the promise just pushed to the stack
await (type $waitref_=>_i32)
;; do stuff with the result just pushed to the stack

The type must receive a waitref, and can return any type (or nothing).

await is only defined in terms of making the host environment do something. It is similar in that sense to the unreachable instruction, which on the Web makes the host throw a RuntimeError, even though that behavior isn't in the core spec. Likewise, the core wasm spec only says that await is meant to wait for something from the host environment, not how that actually happens, which might be very different in different host environments.

That's it for the core wasm spec!

Wasm JS spec

The changes to the wasm JS spec (which affect only JS environments like the Web) are more interesting:

  • A valid waitref value is a JS Promise.
  • When an await is executed on a Promise, the entire wasm instance pauses and waits for that Promise to resolve or reject.
  • If the Promise resolves, the instance resumes execution after pushing to the stack the value received from the Promise (if there is one).
  • If the Promise rejects, we resume execution and throw a wasm exception from the location of the await.

By "the entire wasm instance pauses" we mean all local state is preserved (the call stack, local values, etc.) so that we can resume the current execution later, as if we never paused (of course global state may have changed, like the Memory may have been written to). While we wait, the JS event loop functions normally, and other things can happen. When we resume later (if we don't reject the Promise, in which case an exception would be thrown) we continue exactly where we left off, basically as if we never paused (but meanwhile other things have happened, and global state may have changed, etc.).

What does it look like to JS when it calls a wasm instance which then pauses? To explain that, let's first take a look at a common example encountered when porting native applications to wasm, an event loop:

void event_loop_iteration() {
  // ..
  while (auto task = getTask()) {
    task.run(); // this *may* be a network fetch
  }
  // ..
}

Imagine that this function is called once per requestAnimationFrame. It executes the tasks given to it, which might include: rendering, physics, audio, and network fetching. If we have a network fetch event, then and only then do we end up running an await instruction on the fetch's Promise. We may do that 0 times for one call of event_loop_iteration, or 1 time, or many times. We only know whether we end up doing so during the execution of this wasm - not before, and in particular not in the JS caller of this wasm export. So that caller must be ready for the instance to either pause or not.

A somewhat analogous situation can happen in pure JavaScript:

function foo(bar) {
  // ..
  let result = bar(42);
  // ..
}

foo gets a JS function bar and calls it with some data. In JS bar may be an async function or it may be a normal one. If it's async, it returns a Promise, and only finishes execution later. If it's normal, it executes before returning and returns the actual result. foo can either assume that it knows which kind bar is (no type is written in JS, in fact bar may not even be a function!), or it can handle both types of functions to be fully general.

Now, normally you know exactly what set of functions bar might be! For example, you may have written foo and the possible bars in coordination, or documented exactly what the expectations are. But the wasm/JS interaction we are talking about here is actually more similar to the case where you don't have such a tight coupling, and where in fact you need to handle both cases. As mentioned earlier, the event_loop_iteration example requires that. But even more generally, often the wasm is your compiled application while the JS is generic "runtime" code, so that JS has to handle all cases. JS can easily do so, of course, for example by checking the result with result instanceof Promise, or by using JS await:

async function runEventLoopIteration() {
  // await in JavaScript can handle Promises as well as regular synchronous values
  // in the same way, so the log is guaranteed to be written out consistently after
  // the operation has finished (note: this handles 0 or 1 iterations, but could be
  // generalized)
  await wasm.event_loop_iteration();
  console.log("the event loop iteration is done");
}

(note that if we don't need that console.log then we wouldn't need the JS await in this example, and would have just a normal call to a wasm export)

To summarize the above, we propose that the behavior of a pausing wasm instance be modeled on the JS case of a function that may or may not be async, which we can state as:

  • When an await is executed, the wasm instance immediately exits back out to whoever called into it (typically that would be JS calling a wasm export, but see notes later). The caller receives a Promise which it can use to know when execution of the wasm concludes, and to get a result if there is one.
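For illustration, here is a minimal caller-side sketch of that behavior (the export name run is hypothetical and not part of the proposal text):

const result = instance.exports.run();
if (result instanceof Promise) {
  // the instance paused; the Promise settles once it resumes and finishes
  result.then((value) => console.log("finished later with", value));
} else {
  // the instance never paused; we already have the value
  console.log("finished synchronously with", result);
}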

Toolchain / library support

In our experience with Asyncify and related tools it is easy (and fun!) to write a little JS to handle a waiting wasm instance. Aside from the options mentioned earlier, a library could do one of the following:

  1. Wrap around a wasm instance to make its exports always return a Promise. That gives a nice simple interface to the outside (however, it adds overhead to quick calls into wasm that do not pause). This is what the standalone Asyncify helper library does, for example; a sketch of this approach follows this list.
  2. Write some global state when an instance pauses and check that from the JS that called into the instance. That is what Emscripten's Asyncify integration does, for example.
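As a rough sketch of option 1 (assuming the proposal's behavior where an export returns either a plain value or a Promise; wrapExports is a made-up name, not an existing library API):

function wrapExports(instance) {
  const wrapped = {};
  for (const [name, value] of Object.entries(instance.exports)) {
    if (typeof value !== "function") {
      wrapped[name] = value;
      continue;
    }
    // Promise.resolve turns both plain return values and Promises (from a paused
    // instance) into a Promise, so callers always see the same interface.
    wrapped[name] = (...args) => Promise.resolve(value(...args));
  }
  return wrapped;
}

// usage: every export now returns a Promise, whether or not the instance paused
// const exports = wrapExports(instance);
// exports.event_loop_iteration().then(() => console.log("done"));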

A lot more can be built on top of such approaches, or other ones. We prefer to leave all that to toolchains and libraries to avoid complexity in the proposal and in VMs.

Implementation and Performance

Several factors should help keep VM implementations simple:

  1. A pause/resume occurs only on an await, and we know their locations statically inside each function.
  2. When we resume we continue exactly from where we left things, and we only do so once. In particular, we never "fork" execution: nothing here returns twice, unlike C's setjmp or a coroutine in a system that allows cloning/forking.
  3. It is acceptable if the speed of an await is slower than a normal call out to JS, since we will be waiting on a Promise, which at minimum implies that a Promise was allocated and that we wait on the event loop (which has its own minimum overhead, plus potentially waiting for other things currently running). That is, the use cases here do not demand that VM implementers find ways to make await blazingly fast. We only want await to be efficient relative to these requirements, and in particular expect it to be far faster than Asyncify's large overhead.

Given the above, a natural implementation is to copy the stack when we pause. While that has some overhead, given the performance expectations here it should be very reasonable. And if we copy the stack only when we pause, then we can avoid doing extra work to prepare for pausing. That is, there should be no extra general overhead (which is very different from Asyncify!).

Note that while copying the stack is a natural approach here, it is not a completely trivial operation, as the copy may not be a simple memcpy, depending on the VM's internals. For example, if the stack contains pointers into itself, then those would either need to be adjusted or the stack would need to be relocatable. Alternatively, it might be possible to copy the stack back to its original position before resuming it, since as mentioned earlier it is never "forked".

Note also that nothing in this proposal requires copying the stack. Perhaps some implementations can do other things, thanks to the simplifying factors mentioned in the 3 points from earlier in this section. The observable behavior here is fairly simple, and explicit stack handling is not part of it.

We are very interested to hear VM implementor feedback on this section!

Clarifications

This proposal only pauses WebAssembly execution back out to the caller of the wasm instance. It does not allow pausing host (JS or browser) stack frames. await operates on a wasm instance, only affecting stack frames inside it.

It is ok to call into the WebAssembly instance while a pause has occurred, and multiple pause/resume events can be in flight at once. (Note that if the VM takes the approach of copying the stack then this does not mean a new stack must be allocated each time we enter the module, as we only need to copy it if we actually pause.)
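As a small sketch of that clarification (the export name load_asset and its arguments are hypothetical):

// two calls into the same instance; each may pause independently
const first = instance.exports.load_asset(0);   // may pause and return a Promise
const second = instance.exports.load_asset(1);  // allowed even while the first is paused
Promise.all([first, second].map((r) => Promise.resolve(r)))
  .then(() => console.log("both calls finished"));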

Connection to other proposals

Exceptions

Promise rejection throwing an exception means that this proposal depends on the wasm exceptions proposal.

Coroutines

Andreas Rossberg's coroutines proposal also deals with pausing and resuming execution. However, while there is some conceptual overlap, we don't think the proposals compete. Both are useful because they are focused on different use cases. In particular, the coroutines proposal allows coroutines to be switched between inside wasm, while the await proposal allows an entire instance to wait for the outside environment. And the way in which both things are done leads to different characteristics.

Specifically, the coroutines proposal handles stack creation in an explicit manner (instructions are provided to create a coroutine, to pause one, etc.). The await proposal only talks about pausing and resuming and therefore the stack handling is implicit. Explicit stack handling is appropriate when you know you are creating specific coroutines, while implicit is appropriate when you only know you need to wait for something during execution (see the example from before with event_loop_iteration).

The performance characteristics of those two models may be very different. If for example we created a coroutine every time we ran code that might pause (again, often we don't know in advance) that might allocate memory unnecessarily. The observed behavior of await is simpler than what general coroutines can do and so it may be simpler to implement.

Another significant difference is that await is a single instruction that provides all a wasm module needs in order to fix the sync/async mismatch wasm has with the Web (see the first .wat example from the very beginning). It is also very easy to use on the JS side which can just provide and/or receive a Promise (while a little library code may be useful to add, as mentioned earlier, it can be very minimal).

In theory the two proposals could be designed to be complementary. Perhaps await could be one of the instructions in the coroutines proposal somehow? Another option is to allow an await to operate on a coroutine (basically giving a wasm instance an easy way to wait on coroutine results).

WASI#276

By coincidence WASI #276 was posted by @tqchen just as we were finishing writing this up. We're very happy to see that, as it shares our belief that coroutines and async support are separate functionalities.

We believe that an await instruction could help implement something very similar to what is proposed there (option C3), with the difference that there would not need to be special async syscalls, but rather some syscalls could return a waitref which can then be await-ed.

For JavaScript we defined waiting as pausing a wasm instance, which makes sense because we can have multiple instances as well as JavaScript on the page. However, in some server environments there might only be the host and a single wasm instance, and in that case, waiting can be much simpler, perhaps literally waiting on a file descriptor or on the GPU. Or waiting could pause the entire wasm VM but keep running an event loop. We don't have specific ideas here ourselves, but based on the discussion in that issue there may be interesting possibilities here, we're curious what people think!

Corner case: wasm instance => wasm instance => await

In a JS environment, when a wasm instance pauses it returns immediately to whoever called it. We described what happens if the caller is from JS, and the same thing happens if the caller is the browser (for example, if we did a setTimeout on a wasm export that pauses; but nothing interesting happens there, as the returned Promise is just ignored). But there is another case, of the call coming from wasm, that is, where wasm instance A directly calls an export from instance B, and B pauses. The pause makes us immediately exit out of B and return a Promise.

When the caller is JavaScript, a dynamic language, this is less of an issue, and in fact it's reasonable to expect the caller to check the type as discussed earlier. When the caller is WebAssembly, which is statically typed, this is awkward. If we don't do something in the proposal for this, then the value will be cast, in our example from a Promise to whatever instance A expects (if an i32, it would be cast to 0). Instead, we suggest that an error occur:

  • If a wasm instance calls (directly or using call_indirect) a function from another wasm instance, and while running in the other instance an await is executed, then a RuntimeError exception is thrown from the location of the await.

Importantly, this could be done with no overhead unless pausing, that is, keeping normal wasm instance -> wasm instance calls at full speed, by checking the stack only when doing a pause.

Note that users that do want something like a wasm instance to call another and have the latter pause can do so, but they need to add some JS in between the two.
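One way that JS-in-between might look (a hedged sketch; the names instanceB, do_work, and call_b are made up): instance A imports a function that forwards to instance B's export and wraps the result in a Promise, so A always receives a valid waitref that it can await itself.

const importsForA = {
  env: {
    // Promise.resolve makes the result a Promise whether or not B paused,
    // so A can receive it as a waitref and execute its own await on it
    call_b: (arg) => Promise.resolve(instanceB.exports.do_work(arg)),
  },
};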

Another option here is for a pause to propagate to the calling wasm as well, that is, all wasm would pause all the way out to JS, potentially spanning multiple wasm instances. This has some advantages, like making wasm module boundaries stop mattering, but also downsides, like the propagation being less intuitive (the calling instance's author may not expect such behavior) and the fact that adding JS in the middle could change the behavior (also potentially unexpectedly). Requiring that users have JS in between, as mentioned earlier, seems less risky.

Another option might be for some wasm exports to be marked async while others are not, and then we could know statically what is what, and not allow improper calls; but see the event_loop_iteration example from earlier which is a common case that would not be solved by marking exports, and there are also indirect calls, so we can't avoid the issue that way.

Alternative approaches considered

Perhaps we don't need a new await instruction at all, if wasm pauses whenever a JS import returns a Promise? The problem is that right now it is not an error for a JS import to return a Promise. Such a backwards-incompatible change would mean wasm could no longer receive a Promise without pausing, but that might be useful too.

Another option we considered is to mark imports somehow to say "this import should pause if it returns a Promise". We thought about various options for how to mark them, on either the JS or the wasm side, but didn't find anything that felt right. For example, if we mark imports on the JS side then the wasm module would not know if a call to an import pauses or not until the link step, when imports arrive. That is, calls to imports and pausing would be "mixed together". It seems like the most straightforward thing is to just have a new instruction for this, await, which is explicit about waiting. In theory such a capability may be useful outside the Web as well (see notes earlier), so having an instruction for everyone may make things more consistent overall.

Previous related discussions

https://github.com/WebAssembly/design/issues/1171
https://github.com/WebAssembly/design/issues/1252
https://github.com/WebAssembly/design/issues/1294
https://github.com/WebAssembly/design/issues/1321

Thank you for reading, feedback is welcome!

Most helpful comment

I was hoping to have more discussion publicly here, but to save time I reached out to some VM implementers directly, as few have engaged here so far. Given their feedback together with the discussion here, sadly I think we should pause this proposal.

Await has much simpler observable behavior than general coroutines or stack switching, but the VM people I talked to agree with @rossberg that the VM work in the end would probably be similar for both. And at least some VM people believe we will get coroutines or stack switching anyhow, and that we can support await's use cases using that. That will mean creating a new coroutine/stack on each call into the wasm (unlike with this proposal), but at least some VM people think that could be made fast enough.

In addition to the lack of interest from VM people, we have had some strong objections to this proposal here from @fgmccabe and @RossTate , as discussed above. We disagree on some things but I appreciate those points of view, and the time that went into explaining them.

In conclusion, overall it feels like it would be a waste of everyone's time to try to move forward here. But thank you to everyone that participated in the discussion! And hopefully at least this motivates prioritizing coroutines / stack switching.

Note that the JS part of this proposal may be relevant in the future, as JS sugar basically for convenient Promise integration. We'll need to wait for stack switching or coroutines and see if this could work on top of that. But I don't think it's worth keeping the issue open for that, so closing.

All 96 comments

Excellent write-up! I like the idea of host-controlled suspension. @rossberg's proposal also discusses functional effect systems, and I admittedly am not an expert on them, but at first glance it seems like those could fulfill the same non-local control flow need.

Regarding: "Given the above, a natural implementation is to copy the stack when we pause." How would this work for the execution stack? I imagine that most JIT engines share the native C execution stack between JS and wasm so I'm not sure what saving and restoring would mean in this context. Does this proposal mean that wasm execution stack would need to be somehow virtualized? IIUC avoiding the use of the C stack like this was pretty tricky when python tried to do something similar: https://github.com/stackless-dev/stackless/wiki.

I share a similar worry to @sbc100. Copying the stack is inherently quite a difficult operation, especially if your VM doesn't already have a GC implementation.

@sbc100

Does this proposal mean that wasm execution stack would need to be somehow virtualized?

I have to leave this to VM implementers as I'm not an expert on it. And I don't understand the connection to stackless python, but perhaps I don't know what that is well enough to understand the connection, sorry!

But in general: various coroutine approaches work by manipulating the stack pointer at a low level. Those approaches may be an option here. We wanted to point out that even if the stack has to be copied as part of such an approach, doing so has acceptable overhead in this context.

(We are not certain if those approaches can work in wasm VMs or not - hoping to hear from implementers if yes or no, and whether there are better options!)

@lachlansneff

Can you please explain in more detail what you mean by GC making things easier? I don't follow.

@kripken GCs often (but not always) have the ability to walk a stack, which is necessary if you need to rewrite pointers on the stack to point to the new stack. I believe JSC doesn't have that ability, so I don't believe it would be possible to deep copy stacks with their VM. Perhaps someone who knows more about JSC can confirm or deny this.

@lachlansneff

Thanks, now I see what you're saying.

We do not suggest that walking the stack in such a full way (identifying each local all the way up, etc.) is necessary to do this. (For other possible approaches, see the link in my last comment about low-level coroutine implementation methods.)

I apologize for the terminology of "copy the stack" in the proposal - I see that it was not clear enough, based on your and @sbc100 's feedback. Again, we don't want to suggest a specific VM implementation approach. We just wanted to say that if copying the stack is necessary in some approach, that would not be a problem for speed.

Rather than suggest a specific implementation approach, we hope to hear from VM people how they think this could be done!

I'm very excited to see this proposal. Lucet has had yield and resume operators for a while now, and we use them precisely for interacting with async code running in the Rust host environment.

This was fairly straightforward to add to Lucet, since our design already committed to maintaining a separate stack for Wasm execution, but I could imagine it could present some implementation difficulties for VMs that don't.

This proposal sounds great! We have been trying to find a good way to manage async code in wasmer-js for a little bit (since we have no access to the VM internals in a browser context).

Rather than suggest a specific implementation approach, we hope to hear from VM people how they think this could be done!

I think using the callback strategy for async functions might be the easiest way to get things rolling, and in a language-agnostic way.

It seems .await can be called on a JsPromise inside a Rust function using wasm-bindgen-futures? How can this work without the await instruction proposed here? I'm sorry for my ignorance; I'm looking for solutions to call fetch inside wasm and I'm learning about Asyncify, but it seems that the Rust solution is simpler. What am I missing here? Can someone make it clear for me?

I am very excited about this proposal. Its main advantage is its simplicity: we can build APIs that are synchronous from the wasm's point of view, and it makes it much easier to port applications without having to explicitly think about callbacks and async/await. It would enable us to bring WASM- and WebGPU-based machine learning to native wasm VMs using a single native API and run on both web and native.

One thing that I think is worth discussing is the signature of functions that potentially call await. Imagine that we have the following function:

int test() {
   await();
   return 1;
}

The signature of the corresponding function is () => i32. Under the new proposal, calls into test could return either an i32 or a Promise<i32>. Note that it is harder to ask the user to statically declare a new signature (because of the cost of code porting, and because there could be indirect calls inside the function that we don't know call await).

Should we have a separate call mode into the exported function (e.g. an async call) to indicate that await is permitted at runtime?

Terminology-wise, the proposed operation is like a yield operation in operating systems, since it yields control to the OS (in this case the wasm VM) to wait for the syscall to finish.

If I understand this proposal correctly, I think it's roughly equivalent to removing the restriction that the await in JS be only usable in async functions. That is, on the wasm side waitref could be externref and rather than an await instruction you could have an imported function $await : [externref] -> [], and on the JS side you could supply foo(promise) => await promise as the function to import. In the other direction, if you were JS code that wanted to await on a Promise outside of async function, you could supply that promise to a wasm module that simply calls await on the input. Is that a correct understanding?

@RossTate Not quite, AIUI. The wasm code can await a promise (call it promise1), but only the wasm execution will yield, not the JS. The wasm code will return a different promise (call it promise2) to the JS caller. When promise1 resolves, then the wasm execution continues. Finally, when that wasm code exits normally, then promise2 will resolve with the wasm function's result.

@tqchen

Should we have a separate call mode into the exported function(e.g. async call) to indicate await is permitted during runtime?

Interesting - where do you see the benefit? As you said, there is really no way to be sure if an export will end up doing an await or not, in common porting situations, so at best it could only be used sometimes. Would this help VMs internally though maybe?

Having an explicit declaration might make sure that the user states their intent clearly, and the VM could throw a proper error message if the user did not intend the call to run asynchronously.

From the user's POV it also makes the code writing more consistent. For example, the user could write the following code even if test does not call await, and the system interface would return Promise.resolve(test()) automatically:

await inst.exports_async.test();

It seems .await can be called on a JsPromise inside a Rust function using wasm-bindgen-futures? How can this work without the await instruction proposed here? I'm sorry for my ignorance; I'm looking for solutions to call fetch inside wasm and I'm learning about Asyncify, but it seems that the Rust solution is simpler. What am I missing here? Can someone make it clear for me?

@malbarbo There is little overlap between the two despite the similar use cases; what Rust is doing is essentially full coroutines, which are more in the scope of the other linked proposal.

It's more flexible, but also requires more involvement and overhead from both the language and the codebase - it has to have a concept of async functions natively, as well as mark every single function in the call chain as such.

What this proposal is trying to achieve instead is a way to wait for host-provided syscalls, where their being async is only an implementation detail, and so such functions can be called from anywhere in an existing codebase in a backwards-compatible way, without having to rewrite how the whole app operates. (An example is file I/O, which source languages such as C/C++/Rust normally expect to be available synchronously, but which isn't, e.g., on the Web.)

From the user's POV it also makes the code writing more consistent. For example, the user could write the following code even if test does not call await, and the system interface would return Promise.resolve(test()) automatically:

@tqchen Note that the user can already do this, as shown in the example in the proposal. That is, JavaScript already supports and handles both synchronous and asynchronous values with the await operator in the same fashion.

If the suggestion is to enforce a single static type, then we believe this can be done on either lint or type system level or a JavaScript wrapper level without introducing complexity on the core WebAssembly side or restricting implementers of such wrappers.

Ah, thanks for the correction, @binji.

In that case, is the following roughly equivalent? Add a WebAssembly.instantiateAsync(moduleBytes, imports, "name1", "name2") function to the JS API. Suppose moduleBytes has a number of imports plus an additional import import "name1" "name2" (func (param externref)). Then this function instantiates the imports with the values given by imports and instantiates the additional import with what is conceptually await. When exported functions are created from this module, they get guarded so that when this await is called it walks up the stack to find the first guard and then copies the contents of the stack over into a new Promise that is then immediately returned.

Would that work? My sense is that this proposal can be done solely by modifying the JS API without need to modify WebAssembly itself. Of course, even then it still adds a lot of useful functionality.

@kripken How would the start function be handled? Would it statically disallow await, or would it somehow interact with Wasm instantiation?

@malbarbo wasm-bindgen-futures allows you to run async code in Rust. That means you have to write your program in an async way: you have to mark your functions as async, and you need to use .await. But this proposal allows you to run async code without using async or .await, instead it looks like a regular synchronous function call.

In other words, you cannot currently use synchronous OS APIs (like std::fs) because the web only has async APIs. But with this proposal you could use synchronous OS APIs: they would internally use Promises, but they would look synchronous to Rust.

Even if this proposal is implemented, wasm-bindgen-futures will still exist and will still be useful, because it's handling a different use case (running async functions). And async functions are useful because they can be easily parallelized.

@RossTate It seems your suggestion is quite similar to one covered in "Alternative approaches considered":

Another option we considered is to mark imports somehow to say "this import should pause if it returns a Promise". We thought about various options for how to mark them, on either the JS or the wasm side, but didn't find anything that felt right. For example, if we mark imports on the JS side then the wasm module would not know if a call to an import pauses or not until the link step, when imports arrive. That is, calls to imports and pausing would be "mixed together". It seems like the most straightforward thing is to just have a new instruction for this, await, which is explicit about waiting. In theory such a capability may be useful outside the Web as well (see notes earlier), so having an instruction for everyone may make things more consistent overall.

How would the start function be handled? Would it statically disallow await, or would it somehow interact with Wasm instantiation?

@Pauan We didn't cover this specifically, but I think there's nothing stopping us from allowing await in start as well. In this case the Promise returned from instantiate{Streaming} would still naturally resolve/reject when the start function has finished executing completely, with the only difference being that it would wait for awaited promises.

That said, the same limitations as today apply, and for now it wouldn't be too useful for cases that require access to e.g. the exported memory.

@RReverser How would that work for the synchronous new WebAssembly.Instance (which is used in workers)?

Interesting point @Pauan about start!

Yeah, for synchronous instantiation it seems risky - if await is allowed, it's odd if someone calls into the exports while it's paused. Disallowing await there may be simplest and safest. (Perhaps also in async start for consistency, as there don't seem to be important use cases that this would prevent? Needs more thought.)

(which is used in workers)?

Hmm, good point; I don't think it has to be used in Workers, but since this API already exists, perhaps it could return a Promise? I've seen returning thenables from a constructor as a semi-popular emerging pattern in various libraries, although I'm not sure it's a good idea to do this in a standard API.

I agree disallowing it in start (as in trapping) is safest for now, and we can always change that in the future in a backwards-compatible way should something change.

Maybe I missed something, but there is no discussion of what happens when the WASM execution is paused with an await instruction and a promise returned to JS, then JS calls back into WASM without waiting on the promise.

Is that a valid use case? If it is, then it could allow "main loop" applications to receive input events without yielding to the browser manually. Instead they could yield back by awaiting on a promise that's resolved immediately.

What about cancellation? It's not implemented in JS promises and this causes some issues.

@Kangz

Maybe I missed something, but there is no discussion of what happens when the WASM execution is paused with an await instruction and a promise returned to JS, then JS calls back into WASM without waiting on the promise.

Is that a valid use case? If it is, then it could allow "main loop" applications to receive input events without yielding to the browser manually. Instead they could yield back by awaiting on a promise that's resolved immediately.

The current text is perhaps not clear enough on that. For the first paragraph, yes, that is allowed, see the "Clarifications" section: It is ok to call into the WebAssembly instance while a pause has occurred, and multiple pause/resume events can be in flight at once.

For the second paragraph, no - you can't get events earlier, and you can't make JS resolve a Promise earlier than it would. Let me try to summarize things in another way:

  • When wasm pauses on Promise A, it exits back out to whatever called it, and returns a new Promise B.
  • Wasm resumes when Promise A resolves. That happens at the normal time, which means everything is normal in the JS event loop.
  • After wasm resumes and also finishes running, only then is Promise B resolved.

So in particular Promise B has to resolve after Promise A. You can't get the result of Promise A earlier than JS can get it.

To put it another way: this proposal's behavior can be polyfilled by Asyncify + some JS that uses Promises around it.
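As a small illustration of that ordering (assuming a hypothetical export run that pauses internally):

// promise A is what the wasm awaits internally; promise B is what the caller receives
const promiseB = instance.exports.run();  // pauses internally on some promise A
promiseB.then(() => {
  // runs only after promise A has resolved and the wasm has then resumed and finished
  console.log("B resolved");
});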

@RReverser, I don't think those are the same, but first I think we need to clarify something (if it hasn't already been clarified, in which case I'm sorry for missing it).

There can be multiple calls from JS into the same wasm instance on the same stack at the same time. If await gets executed by the instance, which call gets paused and returns a promise?

For the second paragraph, no - you can't get events earlier, and you can't make JS resolve a Promise earlier than it would.

Sorry I think my question wasn't clear. At the moment "main-loop" apps in C++ use emscripten_set_main_loop so that between each run of the frame function, control is yielded back to the browser, and input or other events can be processed.

With this proposal, it seems like the following should work to translate "main-loop" apps. (though I don't know the JS event loop well)

int main() {
  while (true) {
    frame();
    processEvents();
  }
}

// polyfillable with ASYNCIFY!
void processEvents() {
  __builtin_await(EM_ASM(
    new Promise((resolve, reject) => {
      setTimeout(0, () => resolve());
    })
  ))
}

@Kangz That should work, yes (except you have a small issue with the order of arguments in your setTimeout code, plus it could be simplified):

int main() {
  while (true) {
    frame();
    processEvents();
  }
}

// polyfillable with ASYNCIFY!
void processEvents() {
  __builtin_await(EM_ASM_WAITREF(
    return new Promise(resolve => setTimeout(resolve));
  ));
}

There can be multiple calls from JS into the same wasm instance on the same stack at the same time. If await gets executed by the instance, which call gets paused and returns a promise?

The innermost one. It's the job of the JS wrapper to coordinate the rest if it wishes to do so.

@Kangz Sorry, I misunderstood you before then. Yes, as @RReverser said that should work, and it's a good example of an intended use case here!

As you said it's polyfillable with Asyncify, and in fact it's equivalent to the same code with Asyncify today by replacing the __builtin_await with a call to emscripten_sleep(0) (which does a setTimeout(0)).

Thanks, @RReverser, for the clarification. I think it would help to rephrase the description to say that the (most recent) call into the instance pauses, rather than the instance itself.

In that case, this sounds almost equivalent to adding the following two primitive functions to JS: promise-on-await(f) and await-for-promise(p). The former calls f() but, if during the execution of f() a call is made to await-for-promise(p), instead returns a new Promise that resumes execution after p resolves and itself resolves after that execution is complete (or calls await-for-promise again). If a call to await-for-promise is made within the context of multiple promise-on-awaits, then the most recent one returns a Promise. If a call to await-for-promise is made outside of any promise-on-await, then something bad happens (just like if an instance's start code executes await).

Does that make sense?
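For illustration, here is a usage sketch of those hypothetical primitives (promiseOnAwait and awaitForPromise stand in for promise-on-await and await-for-promise; neither exists in JS today):

function readDataSync(url) {
  // conceptually blocks this call until the fetch settles
  const response = awaitForPromise(fetch(url));
  return awaitForPromise(response.arrayBuffer());
}

// The caller opts in to suspension: because readDataSync calls awaitForPromise,
// promiseOnAwait returns a Promise that resolves with the final result.
promiseOnAwait(() => readDataSync("http://example.com/data.dat"))
  .then((buffer) => console.log("got", buffer.byteLength, "bytes"));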

@RossTate That's quite close, yes, and captures the general idea. (But as you said, only almost equivalent, as it couldn't be used to polyfill this, and it's missing the specific wasm/JS boundary handling.)

Thanks for the suggestion to rephrase that text. I'm keeping a list of such notes from the discussion here. (I'm not sure if it's worth applying them to the first post, as it seems less confusing to not change it over time?)

@RossTate Interesting... I like this! It makes the async nature of the call explicit (promise-on-await is required for any potentially async call), and doesn't require any changes to Wasm. It also makes (some) sense if you remove Wasm from the middle -- if promise-on-await calls await-for-promise directly, then it returns a Promise.

@kripken can you go into more detail about why this would be different? I don't quite understand why the Wasm/JS boundary matters here.

@binji I just meant that such functions in JS would not let wasm do something similar. Calling them as imports from wasm wouldn't work. We still need a way to make wasm exit out to the boundary etc. in a resumable way, don't we?

@kripken right, I guess at that point the await-for-promise import would have to be functioning like a Wasm intrinsic.

My thinking was that, instead of adding an await instruction to wasm, such a module would instead import await-for-promise and call that. Similarly, instead of changing the exported functions, JS code would call them inside a promise-on-await. This means that the JS primitives would handle all the stack work, including the WebAssembly stack. It also would be more flexible, e.g. if you want you could give the module a JS callback that can then call back into the module and have the outer call pause instead of the inner call; it all depends on whether the JS code chooses to wrap the call in promise-on-await or not. I don't think you need to change anything in wasm itself.

I would be interested to hear what @syg thinks about these potential JS primitives.

Oh ok, sorry - I took your comment @RossTate to be "to make sure I understand, let me rephrase it like this, and tell me if that has the right shape", and not a concrete suggestion.

Thinking about it, your idea wants to pause not just JS frames but also wasm ones, but there are also host/browser frames. (The current proposal avoids that by only working on wasm frames up to the boundary where it was called into.) Here's an example:

myList.forEach((item) => {
  .. call something which ends up pausing ..
});

If forEach is implemented in the browser code then it means pausing browser frames. Also significant is that pausing in the middle of such a loop, and resuming later, would be a new power for JS, and your idea would allow that for a normal loop too:

for (let i of something) {
  .. call something which ends up pausing ..
}

And all this may have curious spec interactions with async JS functions. These all seem like large discussions to have with browser and JS people.

But also, this only avoids adding await and waitref to the core wasm spec, and those are tiny additions - since they do nothing in the core spec. The current proposal already has 99% of the complexity on the JS side. And IIUC your proposal trades that small addition to the wasm spec for much larger additions on the JS side - so it makes the Web platform as a whole more complex, and unnecessarily so, since this is all for wasm. Plus, there is actually a benefit to defining await in the core wasm spec: it may be useful outside the Web.

Maybe I've missed something in your suggestion, apologies if so. Overall, I'm curious what your motivation is for trying to avoid an addition to the core wasm spec?

I don't think those primitives make much sense for js, and I think more wasm implementations than the ones in browsers can benefit from this. I'm still curious why resumable exceptions (roughly effects) wouldn't fulfill this use case.

My comment was a combination of both. At a high level, I am trying to figure out if there's a way to rephrase the proposal as purely an enrichment of the JS API (and similarly how other hosts would interact with wasm modules). The exercise helps assess whether wasm truly needs to be changed and helps determine if really the proposal is secretly adding new primitives to JS that JS people may or may not approve of. That is, if it's not possible to do with just an imported await : func (param externref) (result externref), then it's quite likely that this is adding new functionality to JS.

As for the simplicity of the changes to wasm, there are still many things to consider like what to do about module to module calls, what to do when exported functions return GC values that contain pointers to functions that can execute await after the call finishes, and so on.

Returning to the exercise, as you pointed out there are good reasons to only capture the wasm stack. This brings me back to my earlier suggestion, though slightly revised with some new perspective. Add a WebAssembly.instantiateAsync(moduleBytes, imports, "name1", "name2") function to the JS API. Suppose moduleBytes has a number of imports plus an additional import import "name1" "name2" (func (param externref) (result externref)). Then instantiateAsync instantiates the other imports of moduleBytes simply with the values given by imports and instantiates the additional import with what is conceptually await-for-promise. When exported functions are created from this instance, they get guarded (conceptually by promise-on-await) so that when this await-for-promise is called it walks up the stack to find the first guard and then copies the contents of the stack over into a new Promise that is then immediately returned. Now we have the same primitives as I mentioned above, but they are no longer first class, and this restricted pattern ensures that only the wasm stack will ever be captured. At the same time, WebAssembly does not need to be changed to support the pattern.

Thoughts?
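For concreteness, a usage sketch of that hypothetical WebAssembly.instantiateAsync (nothing in it is a real API today; the module is assumed to declare import "control" "await_for_promise" (func (param externref) (result externref)), and the export name run is made up):

WebAssembly.instantiateAsync(
  moduleBytes,
  { env: { log: console.log } },   // the ordinary imports
  "control", "await_for_promise"   // the extra import, instantiated as conceptual await-for-promise
).then(({ instance }) => {
  // exports are guarded (conceptually by promise-on-await): if the wasm calls the
  // special import during this call, the call returns a Promise instead of a plain value
  const result = instance.exports.run();
  Promise.resolve(result).then((value) => console.log("done:", value));
});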

@devsnek

I'm still curious why resumable exceptions (roughly effects) wouldn't fulfill this use case.

They are an option in this space, sure.

My understanding from @rossberg 's last presentation is that he initially wanted to go down that route, but then changed direction to do a coroutine approach. See the slide with title "Problems". After that slide, coroutines are described, which are another option in this space. So maybe your question is more for @rossberg who can perhaps clarify?

This proposal is focused on solving the sync/async problem which does not require as much power as resumable exceptions or coroutines. Those focus on internal interactions inside a wasm module, while we are focused on interaction between a wasm module and the outside (because that's where the sync/async problem happens). That's why we just need a single new instruction in the core wasm spec, and almost all the logic in this proposal is in the wasm JS spec. And it means that you can wait on a Promise like this:

call $get_promise
await
;; use it!

That simplicity in the wasm is useful in itself, but it also means it's very clear to the VM what's going on, which may have benefits too.

@RossTate

That is, if it's not possible to do with just an imported await : func (param externref) (result externref), then it's quite likely that this is adding new functionality to JS.

I don't follow that inference, sorry. But it seems roundabout to me. If you think this proposal adds new functionality to JS, why not show that directly? (I strongly believe it does not, but am curious if you find we made a mistake!)

As for the simplicity of the changes to wasm, there are still many things to consider like what to do about module to module calls

Does the core wasm spec say anything about module to module calls? I don't remember it doing so, and skimming relevant sections now I don't see that. But perhaps I missed something?

My belief is that the core wasm spec additions would be basically to list await, say it is meant to "wait for something", and that's it. That's why I wrote That's it for the core wasm spec! in the proposal. If I'm wrong, please show me in the core wasm spec where we'd need to add more.

Let's speculate and say that some day the core wasm spec will have a new instruction to create a wasm module and call a method on it. In that case, I imagine we'd say await just traps because the point of it is to wait for something on the outside, on the host.

This brings me back to my earlier suggestion, though slightly revised with some new perspective [new idea]

Is that idea not functionally the same as the second paragraph in Alternative approaches considered in the proposal? Such a thing can be done, but we explained why we think it's less good.

@kripken Got it. To be clear, I think await solves the use cases presented in a very practical and elegant way. I'm just also kind of hoping we can maybe use this momentum to solve other use cases as well by broadening the design a bit.

I think that @RossTate's suggestion does indeed sound very much like what is mentioned in "Alternative approaches considered". So I think we should discuss in more detail why that approach was dismissed. I think we can all agree that a solution that involved no wasm spec changes would be preferable, if we can make the JS side workable. I'm trying to understand the downsides you lay out in that section, and why they make the JS-only solution so unacceptable.

I think we can all agree that a solution that involved no wasm spec changes would be preferable

No! See the non-Web use cases discussed here. Without await in the wasm spec, we'd end up with each platform doing something ad-hoc: the JS environment does some import thing, other places create new APIs marked "synchronous", etc. The wasm ecosystem would be less consistent, it would be harder to move a wasm from the Web to other places, etc.

But yes, we should make the core wasm spec part as simple as possible. I think this does that? 99% of the logic is on the JS side (but @RossTate appears to disagree, and we're still trying to figure that out - I asked concrete questions in my last response that I hope will advance things).

My belief is that the core wasm spec additions would be basically to list await, say it is meant to "wait for something", and that's it.

Unless these semantics can be formalized more precisely, this introduces ambiguity or implementation-defined behavior into the spec. We have so far avoided that (at significant cost in the case of SIMD), so this is definitely something I would like to see pinned down. I don't think the proposal itself has to change to make this more formal, but "wait for something" should be reworded in the precise terminology already used by the spec.

Does the core wasm spec say anything about module to module calls?

An instance's imports can be instantiated with another instance's exports. From what I understand of the JS API (and of wasm's compositionality principle), a call to such an import is conceptually a direct call to whatever function the other instance exported. The same goes for (indirect) calls on functional values like funcref that get passed between the two instances.

Let's speculate and say that some day the core wasm spec will have a new instruction to create a wasm module and call a method on it. In that case, I imagine we'd say await just traps because the point of it is to wait for something on the outside, on the host.

Based on the module-compositionality principle discussed at the in-person meeting, it shouldn't trap. It should be as if there were just one (composed) module instance and it executed await. That is, await would pack up the stack up to the most recent JS stack frame.

Note that this implies that if f were the value of an exported unary function of some wasm instance, then the instantiation-parameters object {"some" : {"import" : f}} would be semantically different than {"some" : {"import" : (x) => f(x)}} because calls to the former will stay within the wasm stack whereas calls to the latter will enter the JS stack, even though just barely. So far these instantiation-parameter objects would be considered equivalent. I can go into why that's useful from a code-migration/language-interop standpoint, but that'd be a digression at the moment.

Is that idea not functionally the same as the second paragraph in Alternative approaches considered in the proposal? Such a thing can be done, but we explained why we think it's less good.

Sorry, I read that alternative as meaning something different, but that doesn't matter now except to explain my confusion. Seems like you meant the same as my suggestion, in which case it's worth discussing the pros and cons.

The fact that this proposal is so light on the wasm side is because the await instruction seems to be semantically identical to a call to an imported function. Of course, conventions matter, as you point out! But await is not the only functionality for which this holds; the same is true for most imported functions. In the case of await, my sense is that the concern about convention could be addressed by having modules with this functionality have an import "control" "await" (func (param externref) (result externref)) clause, and to have environments that support this functionality always instantiate that import with the appropriate callback.

That seems to give a solution that saves a ton of work by not changing wasm while still providing the cross-platform portability that you're looking for. But I'm still working to understand the nuances of the proposal, and I've already missed a bunch so far!

The fact that this proposal is so light on the wasm side is because the await instruction seems to be semantically identical to a call to an imported function.

FWIW this is where this proposal originally started, but using intrinsics like that seems more opaque to VMs and is generally discouraged (I think @binji suggested moving away from it in the original discussions).

For example, following your argument, something like memory.grow or atomic.wait could also be done as import "control" "memory_grow" or import "control" "atomic_wait" correspondingly, but they're not, as imports don't provide the same level of interop and static analysis opportunities (both on the VM and the tooling side) as a real instruction.

You could argue that memory.grow as an instruction is still useful for cases where memory is not exported, but atomic.wait definitely could be implemented outside the core. In fact, it's very similar to await, except for the level at which pause/resume occurs and for the fact that await as a function would require way more magic than atomic.wait since it needs to be able to interact with the VM stack and not just block the current thread until a value changes.

@tlively

"wait for something" should be reworded in the precise terminology already used by the spec.

Definitely, yes. I can suggest some more specific text now if that would be helpful:

When an await instruction is executed on a waitref, the host environment is requested to do some work. Typically there would be a natural meaning to what that work is based on what a waitref is on a specific host (in particular, waiting for some form of host event), but from the wasm module's point of view, the semantics of an await are similar to a call to an imported host function, that is: we don't know exactly what the host will do, but at least expect to give it certain types and receive certain results; after the instruction executes, global state (the store) may change; and an exception may be thrown.

The behavior of an await from the host's perspective may be very different, however, from a call to an imported host function, and might involve something like pausing and resuming the wasm module. It is for this reason that this instruction is defined. For the instruction to be usable on a particular host, the host would need to define the proper behavior.

Btw, another comparison that came to me while writing this is the alignment hints on loads and stores. Wasm supports unaligned loads and stores, so the hints cannot lead to different behavior observable by the wasm module (even if the hint is wrong), but for the host they suggest a very different implementation on certain platforms (which may be more efficient). So that is an example of different instructions without internally-observable different semantics, as the spec says: The alignment in load and store instructions does not affect the semantics.

@RossTate

Based on the module-compositionality principle discussed at the in-person meeting, it shouldn't trap. It should be as if there were just one (composed) module instance and it executed await. That is, await would pack up the stack up to the most recent JS stack frame.

Sounds good, and good to know, thanks, I missed that part.

I think this explains to me part of our misunderstanding. Module => module calls are not in the wasm spec atm, which was my point earlier. But it sounds like you are thinking forward to a future spec where they might be. In any case, that doesn't look like a problem here, as compositionality determines exactly how an await should behave in that situation (which is not what I suggested earlier! but makes more sense).

Does the core wasm spec say anything about module to module calls? I don't remember it doing so, and skimming relevant sections now I don't see that. But perhaps I missed something?

Yes, the core wasm spec distinguishes between functions that have been imported from other wasm modules and host functions (§ 4.2.6). The semantics of function invocation (§ 4.4.7) do not depend on the module that defined the function, and in particular cross-module function calls are currently specified to behave identically to same-module function calls.

If awaits beneath cross-module calls are defined to trap, this would then require specifying a traversal up the call stack to inspect whether a cross-module call exists before the last dummy frame created by an invocation from the host (§ 4.5.5). This would be an unfortunate complication in the spec. But I agree with Ross that having cross-module calls trap would be a violation of compositionality, so I would prefer the semantics where the entire stack is frozen back to the last invocation from the host. The simplest way to spec that would be to make await similar to a host function invocation (§ 4.4.7.3), as you say, @kripken. But host function invocations are completely nondeterministic, so then a better name for the instruction from the core spec point of view might be undefined. And at this point I actually start preferring an intrinsic import that will always be provided by the Web platform (and WASI for portability) because the core spec, on its own, does not benefit from having an undefined instruction IMO.

Semantically, a call to the host environment that returns a waitref plus an await is just a blocking call, right?

What value does this provide to non-web embeddings that don't have an asynchronous environment like a browser does and can natively support blocking calls?

@RReverser, I see the point you're making about intrinsics. There is a judgement call involved in deciding when an operation should be defined through uninterpreted functions versus instructions. I think one factor in this judgement is to consider how it interacts with other instructions. memory.grow affects the behavior of other memory instructions. I haven't had a chance to peruse the Threads proposal, but I imagine atomic.wait affects or is affected by the behavior of other synchronization instructions. The spec then has to be updated to formalize these interactions.

But with await all by itself, there aren't any interactions with other instructions. The only interactions are with the host, which is why my intuition would be that this proposal should be done through imported host functions.

I think a big difference between atomic.wait and this proposed await is that the module cannot be re-entered with atomic.wait. The agent is suspended in its entirety.

@kripken:

My understanding from @rossberg 's last presentation is that he initially wanted to go down that route, but then changed direction to do a coroutine approach. See the slide with title "Problems". After that slide, coroutines are described, which are another option in this space. So maybe your question is more for @rossberg who can perhaps clarify?

Yes, so the coroutine-ish factorisation can be thought of as a generalisation of the previous resumable-exceptions design. It still has the same notion of resumable events/exceptions, but the try instruction is decomposed into smaller primitives -- which makes the semantics simpler and the cost model more explicit. It also is somewhat more expressive.

The intent still is that this can express all relevant control abstractions, and async is one of the motivating use cases. To interop with JS async, the JS API could presumably provide a predefined await event (carrying a JS promise as an externref) that a Wasm module could import and throw to suspend. Of course, there are a lot of details that would have to be fleshed out, but in principle that should be possible.

As for the current proposal, I'm still trying to wrap my head around it. :)

In particular, it seems to allow await in any old Wasm function, am I reading that correctly? If so, that is very different from JS, which allows await only in async functions. And that's a very central constraint, because it enables engines to compile await by _local_ transformation of a single (async) function!

Without that constraint, engines would either need to perform a _global_ program transformation (like supposedly Asyncify does), where every call would potentially become much more expensive (you cannot generally know whether some call might reach an await). Or, equivalently, engines would need to be able to create multiple stacks and switch between them!

Now, this is exactly the feature that the coroutine / effect handlers idea tries to introduce into Wasm. But obviously, it is a highly non-trivial addition to the platform and its execution model, a complication that JS has been very careful to avoid for its control abstractions (such as async and generators).

@rossberg

In particular, it seems to allow await in any old Wasm function, am I reading that correctly? If so, that is very different from JS, which allows await only in async functions.

Yes, the model here is very different. JS await is per-function, while this proposal does an await of an entire wasm instance (because the goal is to solve the sync/async mismatch between JS and wasm, which occurs precisely at the boundary between them). Also JS await is for handwritten code, while this is to enable porting of compiled code.

And that's a very central constraint, because it enables engines to compile await by local transformation of a single (async) function! Without that constraint, engines would either need to perform a global program transformation (like supposedly Asyncify does), where every call would potentially become much more expensive (you cannot generally know whether some call might reach an await). Or, equivalently, engines would need to be able to create multiple stacks and switch between them!

Definitely a global program transformation is not intended here! Sorry if that wasn't clear.

As mentioned in the proposal, switching between stacks is one possible implementation option, but note that it isn't the same as coroutine style stack switching:

  • Only the entire wasm instance can pause. This isn't for stack switching inside the module. (In particular, that's why this proposal could have no additions to the core wasm spec and be entirely on the wasm JS side; so far some people prefer that, and I think either way can work.)
  • Coroutines declare stacks explicitly; await does not.
  • await stacks can only be resumed once; there is no forking / returning more than once (not sure if you'll have that in your proposal or not?).
  • The performance model is very different here. await is going to wait on a Promise in JS, which has minimal overhead & latency already. So it's ok if the implementation has some overhead when we actually pause, and we care less than coroutines probably would.

Given those factors, and that the observable behavior of this proposal is that an entire wasm instance pauses, there may be various ways to implement it. For example, off the Web in a VM running a single wasm instance, it could literally just run its event loop until it's time to resume the wasm. On the Web, one implementation approach might be: when an await occurs, copy the entire wasm stack, from the current position to where we called into the wasm; save that on the side; to resume, copy it back in and just proceed from there. There may also be other approaches or variations on these (some perhaps without copying, but again, avoiding copy overhead is not actually crucial here!).

Sorry for the long post, and some repetition from the proposal text itself, but I hope this helps clarify some of the points you referred to?

I think there is a lot to discuss here in terms of implementation. So far @acfoltzer's comment about Lucet is encouraging!

Just to clarify some phrasing in @kripken's most recent comment, it is not the entire wasm instance that pauses. It is just the most recent call from a host frame into wasm on the stack that is paused, and then the host frame is instead returned a corresponding promise (or the appropriate analog for the host). See here for the relevant earlier clarification.
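To make the host-side behavior described above concrete, here is a purely illustrative JS sketch. None of this API surface is specified by the proposal, and the names (wasmBytes, do_fetch, main) are made up; the point is only the shape of the behavior: the innermost JS-to-wasm call returns a Promise when the instance pauses on an await.

// Hypothetical sketch only: assumes wasmBytes contains a module that imports
// env.do_fetch (returning a waitref/Promise) and exports a function main.
const { instance } = await WebAssembly.instantiate(wasmBytes, {
  env: {
    do_fetch: () => fetch("http://example.com/data.dat"),
  },
});

const result = instance.exports.main();
if (result instanceof Promise) {
  // The call paused on an await; it resumes when the awaited Promise settles.
  console.log("final result:", await result);
} else {
  console.log("final result:", result);
}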

Hm, I don't see how that makes a difference. When you await somewhere deep inside Wasm you'll need to capture all of the call stack from at least host entry to that point. And you can keep that suspension (i.e., that stack segment) alive for as long as you want, while making other calls from above or creating more suspensions. And you can resume from somewhere else (I think?). Doesn't that require all the implementation machinery of delimited continuations? Only that the prompt is set upon Wasm entry instead of by a separate construct.

@rossberg

That might be true on some VMs, yes. If await and coroutines end up needing the exact same VM work, then at least no extra work is needed. In that case, the benefit of the await proposal would be the convenient JS integration.

I think you can get convenient JS integration without whole program transformation if you don't allow the module to be re-entered.

I think you can get convenient JS integration without whole program transformation if you don't allow the module to be re-entered.

This sounds like an easier way to get it done, but that would require blocking any module visited in the call stack (or, as a first step, all WebAssembly modules).

This sounds like an easier way to get it done, but that would require blocking any module visited in the call stack (or, as a first step, all WebAssembly modules).

Correct, just like atomic.wait.

@taralx

I think you can get convenient JS integration without whole program transformation if you don't allow the module to be re-entered.

On the one hand re-entry can be useful, for example, a game engine may download a file and not want the UI to be completely paused while doing so (Asyncify allows that today). But on the other hand, maybe re-entry could be disallowed but an application could create multiple instances of the same module for that (all importing the same memory, mutable globals, etc.?), so a re-entry would be a call to another instance. I think we can make that work in toolchains (there would be an effective limit on the number of re-entries active at once - equal to the number of instances - which seems fine).

So if your simplification would help VMs it's definitely worth considering!

(Note though that as discussed earlier I don't think we need a whole program transformation here with any of the options being discussed. You only need that if you are in the bad situation Asyncify is in, where it's all you can do at the toolchain level. For await, in the worst case as discussed with @rossberg you can do what the coroutines proposal would do internally. But your idea is potentially very interesting if it makes things simpler than that!)
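To illustrate the multiple-instance idea a few lines up, here is a rough JS sketch. It assumes, hypothetically, a module built to import its memory (and mutable globals) under env, with made-up export names; what else would need to be shared is exactly the open question noted above.

// Two instances of the same module import the same memory, so a "re-entrant"
// call can be routed to the second instance while the first is paused.
const memory = new WebAssembly.Memory({ initial: 16, maximum: 256 });
const imports = { env: { memory } };

const module = await WebAssembly.compile(wasmBytes);
const instanceA = await WebAssembly.instantiate(module, imports);
const instanceB = await WebAssembly.instantiate(module, imports);

// instanceA.exports.run() is paused on an await...
// ...meanwhile a UI event is handled through the second instance:
instanceB.exports.handle_event();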

On the one hand re-entry can be useful, for example, a game engine may download a file and not want the UI to be completely paused while doing so (Asyncify allows that today).

I am not sure this is a sound feature. It seems to me that this would introduce _unexpected concurrency_ in the application though. A native application which loads assets while rendering would use 2 threads internally, and each thread would map to a WebWorker + SharedArrayBuffer. If an application uses threads then it could also use synchronous Web primitives from WebWorkers (as they are allowed, at least in some cases). Otherwise it is always possible to map async operations on the main thread to blocking operations in a worker using Atomics.wait (for example).

I wonder if the whole use case isn't already solved by multithreading in general. By using blocking primitives in a worker the whole stack (JS/Wasm/browser native) is preserved which seems to be much simpler and robust.
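For concreteness, here is a minimal sketch of the worker + Atomics.wait mapping mentioned above (file names and message shapes are made up, and SharedArrayBuffer requires cross-origin isolation): the worker blocks synchronously while the main thread performs the async operation and then wakes it.

// main.js
const worker = new Worker("worker.js");
worker.onmessage = async ({ data: { flag, url } }) => {
  await fetch(url);          // perform the async operation on the main thread
  Atomics.store(flag, 0, 1); // publish "done"
  Atomics.notify(flag, 0);   // wake the blocked worker
};

// worker.js
const flag = new Int32Array(new SharedArrayBuffer(4));
postMessage({ flag, url: "http://example.com/data.dat" });
Atomics.wait(flag, 0, 0);    // block this worker until the main thread notifies
// ...continue the "synchronous" (e.g. wasm) code from here...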

By using blocking primitives in a worker the whole stack (JS/Wasm/browser native) is preserved which seems to be much simpler and robust.

That's actually another alternative implementation of the standalone Asyncify JS wrapper I've experimented with, but, while it solves the code size problem, the performance overhead was much higher even than the current Asyncify that uses the Wasm transformation.

@alexp-sssup

It seems to me that this would introduce unexpected concurrency in the application though.

Definitely, yes - it needs to be done very carefully, and can break things. We have mixed experience with this using Asyncify, good and bad (for an example valid use case: a file is downloaded in JS, and JS calls into wasm to malloc some space into which to copy it, before resuming). But in any case, re-entry is not a crucial part of this proposal either way.

To add to what @RReverser said, another issue with threads is that support for them is not and will not be universal. But await could be everywhere.

In other languages where async/await have been introduced, re-entry is absolutely key. It's kind of the whole point that other events can happen while one is (a)waiting. It seems to me that re-entry is pretty important.

Furthermore, isn't it true that whenever a module makes any call to an external function it has to assume that it could be re-entered via any of its exports (in the above example, even without any awaiting, any call to an external function is free (no pun intended) to call malloc)?
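A tiny illustrative sketch of that point (js_read_file and the export names are hypothetical): an ordinary imported function can already re-enter the module through its exports before returning, so callers of imports must tolerate that today, with or without await.

let instance;
const imports = {
  env: {
    js_read_file(len) {
      // Re-enter the module synchronously to get a buffer for the data.
      const ptr = instance.exports.malloc(len);
      // ...copy the file contents into the module's memory at ptr...
      return ptr;
    },
  },
};
({ instance } = await WebAssembly.instantiate(wasmBytes, imports));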

an application could create multiple instances of the same module for that (all importing the same memory, mutable globals, etc.?), so a re-entry would be a call to another instance

Only for the module's shared memories. The other memories have to be re-instantiated, which is important for avoiding having one operation stomp on another operation's in-flight changes.

I note that the non-reentrant version of this is polyfillable on any embedding with thread support, in case someone wanted to play with it and see how useful it is.

I note that the non-reentrant version of this is polyfillable on any embedding with thread support, in case someone wanted to play with it and see how useful it is.

As mentioned above, that's something we already played with, but discarded as it brings even worse performance than the current solution, is not universally supported, and additionally makes it very hard to share WebAssembly.Global or WebAssembly.Table with the main thread without additional hacks, making it a poor choice for a transparent polyfill.

The current solution that rewrites the Wasm module doesn't suffer from these issues, but instead has a significant file size cost.

As such, neither of these is great for large real-world apps, which motivates us to look into native support for asynchronous integration as described here.

worse performance

Do you have some kind of benchmark?

Yeah, I can share it when I'm back to work on Tuesday (or, more likely, Wednesday), or it's fairly easy to whip one up yourself that just calls an empty async JS function.
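This is not the benchmark referred to above, but a sketch of the kind of microbenchmark one could whip up as a baseline: it measures the per-call cost of awaiting an empty async JS function, against which the Asyncify and worker-based wrappers could then be compared.

async function emptyAsync() {}

async function bench(iterations = 100_000) {
  const start = performance.now();
  for (let i = 0; i < iterations; i++) {
    await emptyAsync();
  }
  const ms = performance.now() - start;
  console.log(`${((ms * 1e6) / iterations).toFixed(0)} ns per await`);
}

bench();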

Thanks. I could create a microbenchmark, but it wouldn't be terribly instructive.

Oh yeah, mine is a microbenchmark as well since we were interested purely in overhead comparison.

The problem with a microbenchmark is that we don't know how much latency is acceptable to a real application. If it takes an additional 1ms, is that really a problem if the application only performs await operations at a rate of one per second, for example?

I think the focus on the speed of an atomics-based approach may be a distraction. As mentioned earlier, atomics do not and will not work everywhere (due to COOP/COEP), and only a worker could use the atomics approach since the main thread can't block. It's a neat idea, but for a universal solution we need something like Await.

I'm not suggesting it as a long-term solution. I'm suggesting a polyfill that uses it could be used to see if a non-reentrant solution will work for people.

@taralx Oh, ok, now I see, thanks.

@taralx:

I think you can get convenient JS integration without whole program transformation if you don't allow the module to be re-entered.

That would be bad. It means that merging multiple modules could break their behaviour. That would be the antithesis to modularity.

As a general design principle, operational behaviour should never be dependent on module boundaries (other than simple scoping). Modules are merely a grouping and scoping mechanism in Wasm, and you want to maintain the ability to regroup stuff (link/merge/split modules) without that changing the behaviour of a program.

@rossberg: this is generalizable as blocking access to any Wasm module, as proposed earlier. But then it's probably too limiting.

That would be bad. It means that merging multiple modules could break their behaviour. That would be the antithesis to modularity.

That was my point with the polyfilling argument - atomic.wait doesn't break modularity, so this shouldn't either.

@taralx, atomic.wait references a specific location in a specific memory. Which memory and location would await blocking use, and how would one control which modules share that memory?

@rossberg can you elaborate on a scenario you think this breaks? I suspect we have different ideas on how the non-reentrant version would work.

@taralx, consider loading two modules A and B, each providing some export function, say A.f and B.g. Both might perform await when called. Two pieces of client code are each passed one of these functions, respectively, and they call them independently. They don't interfere or block one another. Then somebody merges or refactors A and B into C, without changing anything about the code. Suddenly both pieces of client code could start blocking each other unexpectedly. Spooky action at a distance through hidden shared state.

That makes sense. But allowing re-entry risks concurrency in modules that don't expect it, so it's spooky action at a distance either way.

But modules are already re-enterable, no? Whenever a module calls an import, the external code can re-enter the module, which could change global state before returning. I can't see how re-entry during the proposed await is any more spooky or concurrent than calling an imported function. Maybe I'm missing something?


Hm, yes. Okay, so an imported function could re-enter the module. I clearly need to think harder about this.

When code is running and it calls a function, there are two possibilities: either it knows that the function will not call arbitrary code, or it must assume that the function might. In the latter case, re-entrancy is always possible. The same rules apply to await.

(edited my comment above)

Thanks everyone for the discussion so far!

To summarize, it sounds like there is general interest here, but there are big open questions like whether this should be 100% on the JS side or just 99% - sounds like the former would remove the major worries some people have, and that would be fine for the Web case, so that is probably ok. Another big open question is how feasible this would be to do in VMs which we need more info about.

I'll suggest an agenda item for the next CG meeting in 2 weeks to discuss this proposal and consider it for stage 1, which would mean opening a repo and discussing the open questions in separate issues in more detail there. (I believe that's the right process, but please correct me if I'm wrong.)

Just FYI, we will be putting together a full stack switching proposal in a similar time frame. I feel that might make your special case variant moot - what do you think?

Francis


@fgmccabe

We should discuss that for sure.

In general though, unless your proposal focuses on the JS side, I'm guessing it wouldn't make this one moot (which is 99%-100% on the JS side).

Now that discussion on implementation details has concluded, I would like to reraise a higher-level concern I expressed earlier but dropped for the sake of having one discussion at a time.

A program is made up of many components. From a software-engineering perspective, it is important that splitting components into parts or merging components together does not significantly change the behavior of the program. This is the reasoning behind the module-composition principle discussed at the last in-person CG meeting, and it's implicit in the design of many languages.

In the case of web programs, now with WebAssembly these different components might even be written in different languages: JS or wasm. In fact, many components could just as well be written in either language; I'll refer to these as "ambivalent" components. Right now, most ambivalent components are written in JS, but I imagine we're all hoping that more and more of them will be rewritten into wasm. To facilitate this "code migration", we should try to ensure that rewriting a component in this fashion does not change how it interacts with the environment. As a toy example, whether a particular "apply" program component (f, x) => f(x) is written in JS or in wasm should not affect the behavior of the overall program. This is a code-migration principle.

Unfortunately, all of the variants of this proposal seem to violate either the module-composition principle or the code-migration principle. The former is violated when await captures the stack up to where the current wasm module was most recently entered, because this boundary changes as modules are split apart or combined together. The latter is violated when await captures the stack up to where wasm was most recently entered, because this boundary changes as code is migrated from JS to wasm (so that migrating something as simple as (f, x) => f(x) from JS to wasm can significantly change the behavior of the overall program).

I don't think these violations are due to poor design choices of this proposal. Rather, the problem seems to be that this proposal is trying to avoid indirectly making JS any more powerful, and that goal is forcing it to impose artificial boundaries that violate these principles. I totally understand that goal, but I suspect this problem will come up more and more: adding functionality to WebAssembly in a manner that respects these principles will often require indirectly adding functionality to JS due to JS being the embedding language. My preference would be to tackle that issue head on (which I really have no idea how to resolve). If not that, then my secondary preference would be to make this change solely in the JS API, because it is JS that is the limiting factor here, rather than add instructions to WebAssembly that wasm has no interpretation for.

I don't think these violations are due to poor design choices of this proposal. Rather, the problem seems to be that this proposal is trying to avoid indirectly making JS any more powerful

That is important, but it is not the main reason for the design here.

The main reason for this design is that while I fully agree that the principle of composition makes sense for wasm, the fundamental problem we have on the Web is that in fact JS and wasm are not equivalent in practice. We have handwritten JS that is async and ported wasm that is sync. In other words, the boundary between them is actually the exact problem we are trying to address. Overall I am not sure I agree the principle of composition should be applied to wasm and JS (but maybe it should, could be an interesting debate).

I was hoping to have more discussion publicly here, but to save time I reached out to some VM implementers directly, as few have engaged here so far. Given their feedback together with the discussion here, sadly I think we should pause this proposal.

Await has much simpler observable behavior than general coroutines or stack switching, but the VM people I talked to agree with @rossberg that the VM work in the end would probably be similar for both. And at least some VM people believe we will get coroutines or stack switching anyhow, and that we can support await's use cases using that. That will mean creating a new coroutine/stack on each call into the wasm (unlike with this proposal), but at least some VM people think that could be made fast enough.

In addition to the lack of interest from VM people, we have had some strong objections to this proposal here from @fgmccabe and @RossTate , as discussed above. We disagree on some things but I appreciate those points of view, and the time that went into explaining them.

In conclusion, overall it feels like it would be a waste of everyone's time to try to move forward here. But thank you to everyone that participated in the discussion! And hopefully at least this motivates prioritizing coroutines / stack switching.

Note that the JS part of this proposal may be relevant in the future, as JS sugar basically for convenient Promise integration. We'll need to wait for stack switching or coroutines and see if this could work on top of that. But I don't think it's worth keeping the issue open for that, so closing.
