Design: Proposal: Import resolution syntax for native wasm hosts

Created on 19 Jun 2019  ·  25 Comments  ·  Source: WebAssembly/design

I would like to propose a syntax and resolution mechanism for import instructions in native wasm hosts (so, for now, mostly WASI and maybe node.js).

Rationale

The model I'm aspiring towards is what @lukewagner has called shared-nothing linking; which I'll call trustless composition in this proposal.

In this model, wasm developers can pull from an npm-like ecosystem of modules; each module is self-contained with its own dependencies, and generally doesn't depend on globally-installed packages. This is also the model picked by snapcraft and Flatpak; its main benefit is that every installation of a program in this model is guaranteed to be the same, which is immensely helpful for bug reporting and for avoiding dependency hell problems.

(seriously, dependency hell is awful; the convenience of an import system that can mitigate these problems can't be overstated)

A wasm version of this model should also be safe, even in the face of malicious code, hence the name "trustless composition". A native import resolution system should respect the basic principle of object capabilities: untrusted code can only access data that has been explicitly passed to that code. In other words, sub-sub-sub-subdependencies shouldn't be able to import /usr/bin/someprogram.wasm even if the host has filesystem permissions to that file.

Now, importing a wasm module directly from another wasm module is currently mostly impossible. Using wasm modules together requires some glue code to be written in a JS embedder. There's an upcoming esm integration proposal, but:

  • The proposal is aimed at JS embeddings, and makes assumptions based on the ES environment.

  • The proposal doesn't specify a pathname resolution mechanism; assuming that the final mechanism is implementation-dependent, node.js's pathname resolution doesn't quite pass the "trustless" requirement of wasm.

In short, wasm needs a standard, safe, non-JS mechanism to import modules from other modules. While this standard might not be exclusively used by WASI, WASI covers the majority of use cases.

Proposal

During the instantiation phase (eg WebAssembly.instantiate, wasm_instance_new and wasm::Instance::make), the WebAssembly compiler takes an optional filesystem root object, and an optional import map object.

The filesystem root will usually be the directory the .wasm file is in. The import map object is platform-dependent.

When parsing an import string, the compiler applies the following algorithm:

  • Check that the path is a valid URI path (or some other scheme).
  • Check that the path is relative, and doesn't include . or .. segments.
  • If the first segment is prefixed with @, map it to the path defined in the host import map.
  • If the first segment is prefixed with $, check the current folder, and all parent folders up to the import root, for wasi_import_map.json files; return the first time a map has an entry matching the segment.
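A rough sketch of the steps above, in JavaScript. The name `classifyImport` and the returned `kind` labels are made up for illustration and are not part of any WebAssembly API. The "valid URI path" check is reduced to rejecting absolute paths and scheme prefixes, and rejecting empty segments is an extra assumption.

```javascript
// Hypothetical helper: classify an import string following the rules above.
function classifyImport(importString) {
  // Reject absolute paths and anything carrying a URI scheme ("file:", "https:").
  if (importString.startsWith("/") || /^[A-Za-z][A-Za-z0-9+.-]*:/.test(importString)) {
    throw new Error(`Import "${importString}" must be a relative path`);
  }
  const segments = importString.split("/");
  // Reject "." and ".." (and, as an assumption, empty segments) so that a
  // path can never climb out of the importing module's directory.
  for (const segment of segments) {
    if (segment === "" || segment === "." || segment === "..") {
      throw new Error(`Import "${importString}" contains an invalid segment`);
    }
  }
  const first = segments[0];
  // "@..." is looked up in the host import map.
  if (first.startsWith("@")) return { kind: "host", key: first, rest: segments.slice(1) };
  // "$..." is looked up in import map files between here and the import root.
  if (first.startsWith("$")) return { kind: "mapped", key: first, rest: segments.slice(1) };
  // Anything else is a plain file path below the importing module's directory.
  return { kind: "file", segments };
}

console.log(classifyImport("@wasi/fs").kind);
console.log(classifyImport("$react").kind);
console.log(classifyImport("foo/foo.wasm").kind);
```

Keeping classification separate from the actual lookup means a host can report which of the three resolution paths failed.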

Host imports (@)

This syntax is used to import host functionality, eg @wasi/fs or @stdjs/global/Array.

It has the benefit of not colliding with existing JS import conventions (as long as "wasi" and "stdjs" are reserved as NPM orgs).

Mapped imports ($)

This syntax is used to import data out of the current directory, in an OCAP-safe way.

The fundamental principle is that, by default, a .wasm file only has access to files in its current directory, and subdirectories; however, a directory can have a wasi_import_map.json file, which declares hooks that .wasm files in subdirectories can access.

For instance:

myWasmModule/
    index.wasm
    foo/
        wasi_import_map.json
        foo.wasm
        foobar/
            foobar.wasm
    bar/
        bar.wasm

In the above example:

  • index.wasm can import foo.wasm, foobar.wasm and bar.wasm.
  • foo.wasm and bar.wasm cannot import each other or index.wasm.
  • foobar.wasm can import foo.wasm iff foo/wasi_import_map.json has a key to that path.
  • foo/wasi_import_map.json cannot give foo.wasm or foobar.wasm access to index.wasm or bar.wasm.
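The direct (non-mapped) part of these rules can be sketched as a containment check. `canImportDirectly` is a hypothetical name, and the sketch assumes both paths are already normalized, relative to the import root, and free of `.` and `..` segments.

```javascript
// Hypothetical check: may the module at `importerPath` import `targetPath`
// without going through an import map? Paths are relative to the import root.
function canImportDirectly(importerPath, targetPath) {
  // Directory of the importer: "foo/foo.wasm" -> "foo/", "index.wasm" -> "".
  const importerDir = importerPath.slice(0, importerPath.lastIndexOf("/") + 1);
  // A target is reachable only if it sits in that directory or below it.
  return targetPath !== importerPath && targetPath.startsWith(importerDir);
}

console.log(canImportDirectly("index.wasm", "foo/foobar/foobar.wasm")); // true
console.log(canImportDirectly("foo/foo.wasm", "bar/bar.wasm"));         // false
console.log(canImportDirectly("foo/foobar/foobar.wasm", "foo/foo.wasm")); // false: needs an import map entry
```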

Import mapping has a few use cases:

  • Allowing a package manager to apply dependency deduping to reduce compile times and improve code cache size.

  • Provide shims and polyfills.

  • Use different libraries for debug vs release builds.

Why not use WICG's import maps?

There's an argument to be made that it would be simpler to reuse WICG's import maps proposal (from now on WIM), and that having special cases for $ and @ is extraneous complexity.

Some counterpoints:

  • WIM solves a different problem than WASI import maps. WIM is based on the principle that making a fetch call at a static address is generally safe, and cannot leak data. This proposal instead relies on the OCAP model of safety.

  • Having a separate syntax for normal, host and mapped imports allows developers and tools to communicate their intent specifically; it also encourages more readable error messages.

Web use case

Note that this proposal is intended for native wasm interpreters. How browsers resolve import instructions is completely independent from this proposal.

However, if a package manager ecosystem were to develop around this import structure, bundlers like Webpack and Rollup could start providing a browser-friendly version of that structure like they currently do with npm.


Anyway, that's my proposal. Any feedback welcome :)

All 25 comments

what if instead of having a specific syntax, we just recommend that wasi implementations apply libpreload style capability rules to import resolution? if you have a.wasm and b.wasm you can declare . or ./b.wasm in your wasi manifest so you can import it from a.wasm

this would make wasi way more compatible with other random systems it can be joined with, such as browsers and node.

I like the general direction of this proposal. Preventing libs from arbitrarily grabbing imports seems important. I also like that this proposal is fairly lightweight. A couple of thoughts, though.

  • Your description of import maps is a bit incomplete. Is the idea that a map entry a: b/c would allow importing b/c under the name $a? If so, I'm not sure how that allows shimming or remapping libs, since the importing module needs to be aware that the name is (re)mapped. Why is the $ prefix needed?

  • It would be good to treat directories in a more module-like fashion, where each directory can also restrict the files it "exports" outwards to the surrounding. Such "export maps" would be in addition to the import maps that define the environments facing inwards. If an export map is given, only the files/paths listed therein may be accessed from modules or import/export maps in surrounding or sibling directories.

  • I strongly agree with using standard URI references as the naming mechanism. However, the @ or $ prefixes are abusing the notion of relative reference a little (@a/b is a relative reference in URI terms, whereas the semantics it is assigned here is very much an absolute form of ref).

  • Bikeshedding: Is there much benefit from using JSON for import maps? As you describe them, they merely consist of a list of path pairs, so a complex and muddled format like JSON may be overkill for this.

If so, I'm not sure how that allows shimming or remapping libs

I was thinking about dependency injection, where (for instance) two modules that need to use the same instance of react can import $react and rely on the project root to provide the mapping.

Though, now that I think about it, that does bring up the question of what to do when two files/modules import the same file/module. Should they create two separate instances, or one shared instance? Both answers have their use cases.

I was also thinking there could be a mechanism to have multiple wasm_import_map files for debug, release, unit tests, etc, but that seems like premature feature creep. Wasm doesn't really need to become Cmake.

Why is the $ prefix needed?

It allows for clearer error messaging, and faster diagnostic of problems.

Cannot find host module "@foobar"

vs

Cannot find file "some/path/foobar.wasm"

vs

Key "$foobar" not found in imports maps:
- "some/path/wasm_import_map.json"
- "some/wasm_import_map.json"

Not having prefixes means you either have long error messages listing all possibilities, or, more likely, you get the same vague import errors as C compilers that say "Cannot resolve foobar" and leave you to guess why.

It would be good to treat directories in a more module-like fashion, where each directory can also restrict the files it "exports" outwards to the surrounding.

I don't think export maps would be that useful. A module that wants to export, say, a "bindings" submodule and a "network" submodule could just have bindings.wasm and network.wasm files at the root.

A mechanism to restrict importing from parent directories is necessary for filesystem safety, but you don't need to (and pretty much can't) restrict importing from child directories.

Is there much benefit from using JSON for import maps?

Not really. I was thinking about protobuff too, but really, any format could do. There just needs to be one or two default formats.

I was thinking about dependency injection, where (for instance) two modules that need to use the same instance of react can import $react and rely on the project root to provide the mapping.

You can indicate parameters with $ as a naming convention; that doesn't imply any need to bake it in as a syntactic requirement. Obviously, not all uses would be parameterisation (a.k.a. "dependency injection", as some confused souls renamed it :) ). For shimming, I think you'll need to be able to remap arbitrary modules, including system ones (i.e., names starting with @ in what you describe).

Error messages seem like a solvable problem. Especially since the system you describe is simple enough, i.e., the name can only be found in any of the parent dirs or a map therein.

I don't think export maps would be that useful. A module that wants to export, say, a "bindings" submodule and a "network" submodule could just have bindings.wasm and network.wasm files at the root.

Would that prevent anybody from overriding the intention by import-mapping modules in nested dirs?

Error messages seem like a solvable problem.

Well yeah, they are for #include too, and yet troubleshooting a failed library include can be surprisingly difficult.

Either way, I think there's inherent value in forcing the user to specify their intent. It creates a pit of success for tool designers, who know what the user wants unambiguously; and it allows users to more easily intuit how the toolchain works.

For shimming, I think you'll need to be able to remap arbitrary modules, including system ones (i.e., names starting with @ in what you describe).

Re arbitrary modules, I specifically wrote the proposal to avoid that. If you make it possible to shim any import the way WICG import maps allow you to, you make the underlying design less robust, less KISS, and you open yourself to a lot of name shadowing errors / exploits.

(that's not a criticism of WICG import maps; like I said, they address different use cases)

I think dependency injection parameterisation is enough for most shimming use cases, and for more intrusive shimming you can always have a tool "manually" replace some files with others.

Re: system modules, I was thinking that shimming them would be done at the WebAssembly.instantiate API level, since the most common use case would be for unit tests.

The syntax could always be merged with $, but again, there's value in expressing intent.

(also, worst case scenario, we deprecate the @ and $ and make them an optional naming convention; the opposite progression breaks compatibility)

Would that prevent anybody from overriding the intention by import-mapping modules in nested dirs?

No, but I'm not sure that's a problem.

As far as overriding the module writer's intent, well, that's the module user's prerogative. If they want to muck in a module's internals, they do so at their own risk, and they could do it by forking the module anyway.

As far as sandbox safety goes, I guess it could be a problem if a malicious module imports some external module shared with other programs, in a way that breaks invariants (eg instead of importing /usr/bin/game_engine/save_progress, the malicious code imports /usr/bin/game_engine/save_progress/unsafe_filesystem_functions).

There are probably workarounds for that, though, and I don't think there's a lot of possible attack vectors anyway.

But I guess if I'm aiming for a "pit of success" and "safe sandboxing" approach, which is kind of the point of ocap and WASI, these need to be considered too.

the name can only be found in any of the parent dirs or a map therein.

A map therein, yes, any parent dirs, no.

Accessing files in parent directories breaks sandboxing and isn't allowed.

A .wasm file can only access:

  • .wasm files in their directory
  • .wasm files in subdirectories
  • wasm_import_map.json files in parent directories, up to the import root

Import maps themselves obey the same rules, and can only access wasm files in subdirectories and import maps in parents.
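As a sketch of how the lookup for a mapped import could walk these rules: the helper names `importMapCandidates` and `resolveMapped` are hypothetical, and the map contents are passed in as a plain object rather than read from disk. Following the original proposal text, the module's own directory is checked first, then each parent up to the import root.

```javascript
// Hypothetical helper: list the import map files a module may consult,
// nearest directory first, stopping at the import root.
function importMapCandidates(modulePath, mapName = "wasm_import_map.json") {
  const dirs = modulePath.split("/").slice(0, -1); // drop the file name
  const candidates = [];
  for (let depth = dirs.length; depth >= 0; depth--) {
    candidates.push([...dirs.slice(0, depth), mapName].join("/"));
  }
  return candidates;
}

// Hypothetical lookup: return the first map that has an entry for `key`.
// `maps` stands in for the parsed contents of the map files found on disk.
function resolveMapped(key, modulePath, maps) {
  const searched = importMapCandidates(modulePath);
  for (const mapPath of searched) {
    const map = maps[mapPath];
    if (map && Object.prototype.hasOwnProperty.call(map, key)) {
      return { mapPath, target: map[key] };
    }
  }
  throw new Error(`Key "${key}" not found in import maps:\n- ${searched.join("\n- ")}`);
}

const maps = { "foo/wasm_import_map.json": { "$foo": "foo.wasm" } };
console.log(resolveMapped("$foo", "foo/foobar/foobar.wasm", maps));
```

The search never leaves the chain of directories between the module and the import root, so a map can only hand out what its own directory is allowed to see.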

I think there's inherent value in forcing the user to specify their intent.

I very much agree, but in this case it seems to prematurely bar valid use cases associated with a different intent.

As far as overriding the module writer's intent, well, that's the module user's prerogative. If they want to muck in a module's internals, they do so at their own risk

Yeah... I have heard that argument before ;). Unfortunately, that's not how the game theory plays out in practice. If there is no way to explicitly express and (to some degree) enforce such intent then somebody will override it, because nothing is stopping them. And it only takes one popular client to do so and suddenly the module writer realises that all the burden is on them to retain the unofficial API or get all the fire from transitive customers for her/his update breaking their code. Been there, done that; it's a situation that a well-designed system should help avoiding.

The dual to gating imports is gating exports. Wasm modules intentionally support proper encapsulation via both, the same should be true one level up.

Accessing files in parent directories breaks sandboxing and isn't allowed.

Right, I meant the import maps in parent dirs only, the "or" was sloppy wording.

Been there, done that; it's a situation that a well-designed system should help avoiding.

Fair enough.

I very much agree, but in this case it seems to prematurely bar valid use cases associated with a different intent.

To be clear, is your main reservation regarding typed prefixes:

  • That they prevent specific use cases that a more general system would allow?
  • Or that they represent a design constraint, and therefore they might bar future use cases that we didn't anticipate?

I don't get the feeling it's the latter, but if it is, I have to object, because the design I'm suggesting isn't arbitrary, and covers problems that need to be addressed just as much as unanticipated future use cases.

If it's the former, do you have specific examples in mind, so that I can adjust my proposal? Obviously the wasm package ecosystem we want to design for doesn't exist yet, but even a vague "we have X inside directory Y, we want to shim Z" use case would help me.

To be clear, my reasons for including prefixes are:

  • Convey user intent unambiguously.
  • Avoid shadowing problems, eg a package breaks because it imports a file named streams.wasm and a package named stream has been added to the root project.
  • Have a reserved syntax for system libs, without worrying about existing package names.

I am curious if WASM/WASI should cover any module behaviours at all. My first thought is that they shouldn't, so that they can be used in lots of places regardless of module system. The way that wasmtime loads and links WASM modules is very very different from how something like node loads and links WASM modules, but ideally they should both "just work".

I wonder if it's possible to align between web and non-web environments on some of these details.

For example, for superficial details: Although I have advocated for @namespace/modulename, the web seems to be heading towards namespace:modulename for built-in modules, which seems fine to me, and like it should work for non-web envs as well.

More broadly, the Wasm/ESM integration proposal does use JS plumbing, but it does this to encode some things that make sense equally well outside of JS, e.g., that you can't have multiple Wasm modules that circularly import each other's function exports (with an error during Instantiation). I wonder if we could work to phrase the ESM integration spec to make these cross-environment semantics more clear.

The way that wasmtime loads and links WASM modules

I'm pretty sure wasmtime currently doesn't link WASM modules at all. It's not a documented feature, it empirically doesn't work in my tests, and looking at the source, the Resolver interface is only implemented with Namespace (which matches wasi_unstable and other hardcoded names) and NullResolver.

My first thought is that they shouldn't, so that they can be used in lots of places regardless of module system.

But that's not really how it works in practice. Saying "We should have very general standards, so that everyone can do what best suits their needs" often turns into "There's a dozen competing conventions / informal standards, none of them correctly address the problem space, and most tools only understand 2 or 3 of them".

Although I have advocated for @namespace/modulename, the web seems to be heading towards namespace:modulename for built-in modules,

Sure. Syntax aside, I'm more interested in the idea of differentiating filesystem import vs parameterized imports vs host imports.

More broadly, the Wasm/ESM integration proposal does use JS plumbing, but it does this to encode some things that make sense equally well outside of JS, e.g., that you can't have multiple Wasm modules that circularly import each other's function exports (with an error during Instantiation).

Yeah, I think this is the area where this proposal is the most lacking. I need to better define linking semantics, and make an issue on the esm-integration repo to discuss the overlap.

what I meant was more like, we should avoid specifying specific things like import syntaxes and resolution (having ocap rules and stuff like that is cool though). a system wishing to integrate wasm may already be using a certain syntax and resolution system that could clash with ours. it may not even have a concept of paths and files.

I don't have a simple use case off-hand. However, conceptually, there are only two kinds of module references: determinate ones, which refer to a fixed known module, typically part of the same component, and indeterminate ones, which are effectively parameters, and are resolved by the module's client. (Interestingly, Wasm itself only has the latter.)

From that perspective, it is not entirely clear to me what the third kind adds. I can see how it could be useful to have a customisable overlay mechanism that can emulate both other kinds. But that only makes sense if it can take both syntactic forms. OTOH, if it cannot overlay the others, then how is it semantically different from a package import -- i.e., what's the deeper difference between @ and $, and why is it beneficial to hardwire that difference?

Maybe your answer is that the difference is just in pragmatics of intended resolution, but then naming conventions would perhaps be enough.

Okay, I think I understand your question better now.

what's the deeper difference between @ and $, and why is it beneficial to hardwire that difference?

I had planned a longer answer, but as I'm writing it, I'm starting to question the utility of that difference myself.

My reasoning is that we want determinate references to only point to child directories for sandbox safety, but we also need some mechanism to create determinate references to parent directories... except I'm not actually sure we do?

The use cases I was thinking about were mostly shimming and alternate builds (unit tests vs debug vs release), but those don't really need to be baked in the package system.

I still think that if we add standard import maps, then we need to add them in a way that can't accidentally create shadowing problems, but I'm not sure import maps are useful in offline import resolution anymore.


Anyway, assuming we drop the $ syntax and import maps, do you think the rest of the proposal is worth standardizing?

The amended proposal would be:

  • Add an optional filesystem root object to the instantiation APIs (WebAssembly.instantiate, wasm_instance_new, wasm::Instance::make)

  • In tools conventions, add a specification for parsing import strings as URIs, with the sandboxing rules described above.

    • Carve out a syntax (@foo/bar or foo:bar or whatever the W3C settles on) for host imports.

Potential import map / export map / shimming mechanisms could be added later.

I've been feeling the need for something like this too. However, I feel like we're not yet equipped to specify anything without introducing some fundamentally new, topologically "bigger" concept (that contains multiple modules) such as "package" (unit of code in a package manager like npm) or "container" (unit of code that can be executed by a container runtime) because, as pointed out above, we can't assume some ambient filesystem exists whose paths we can talk about. Once you have a concept of "package" or "container" (both of which I think make sense, for different phases of deployment), then it's much easier to be concrete about how resolution works.

I also share @littledan's goal that, whatever gets defined should have the same interpretation inside and outside of a JS/ESM environment. That way you can have a single package of wasm modules that can be used just as well in Node as outside.

I share the strong feeling that we need to start coordinating and aligning these proposals with each other, anticipating the need for a holistic concept of a package or container. We should not exclude the web, but we should also keep the module sandbox intact. This proposal ties into the WebIDL proposal by @lukewagner and @jgravelle-google, the ESM integration proposal by @littledan and @linclark, and the WASI standard by @sunfishcode and @tschneidereit.

I see the key potential of linking and interfacing between Webassembly modules in the ability to:

  • link modules written in different programming languages in the same thread in a feasible and comparably efficient way that may avoid serialisation or IPC
  • contain security vulnerabilities in untrusted modules (accidental like buffer overflows or malicious) with the module sandbox with clearly defined (app like) permissions (or in WASI terms object capabilities) enforced from outside the module with the module interface as the security gate
  • allow building an ecosystem of universal modules that can easily be reused without the need to compile them into a single module or even a native executable which enables the ease of use of npm install for wasm modules with an acceptable performance tradeoff
  • allow for better loading times and caching of small webassembly modules instead of a large monolithic "bundle" (package managers like npm with tink, entropic and yarn are moving in the direction of file based hashing/caching of files with a global singleton instance instead of shipping whole packages copying redundant files. Also bundlers like webpack are evaluating the potential to skip bundling with HTTP/2.)

I think the module sandbox of Webassembly, which isolates untrusted code inside a module from the outside, is something quite unique. It has the potential to introduce security gates when using 3rd party code that is malicious or has security vulnerabilities such as buffer overflows. This potential has gone widely unnoticed even within the community, with some strong claims actually to the contrary from profiled experts in the community.

My concern is that if we don't raise the awareness for this module sandbox feature and coordinate these inter-module proposals to keep it, we might lose this feature before it has even seen the light of day and essentially we're back to process sandboxes. I know there are side-channel timing attacks that allow reading memory out of the module sandbox, but at least it's read-only and one can limit the damage if for example a jpg decoder module used within thousands of applications is compromised it cannot call home right away to send your password if its interface only has access to image buffers but no network APIs.

If done right Webassembly could be the start of a new child-safe(r) plug-and-play modular ecosystem that will help solve or at least contain a couple of epidemic and deep problems in current software systems such as compatibility, security, administrative effort (containers) and short lifespan of code.

I see this filesystem based approach as pragmatic, which might actually be quite close to what is needed, since a folder with a central index.js like in ES modules has proven to be a helpful natural next bigger unit of a module above a single file that still allows for efficient file caching while keeping the files depending on each other together. However I believe the module should not be able to directly import modules outside of its own directory; there, a globally unique reference like an npm package name that is resolved and security checked by the host would be needed.

Any thoughts on this?

Keeping the whole module sandbox thing is good. However, I'd avoid focusing on stuff like where the modules come from (implementation dependent), and more on the relationship between modules. For example, even within filesystems (and there are a lot of filesystems, not all of them even have a concept of file trees), there are disagreements about the rules of these capability systems (upward traversal, downward traversal, sibling files, symlinks, etc). JS's WebAssembly API doesn't even have a concept of the storage location of wasm, they're just blobs of data floating around the VM.

@devsnek Totally agree it should be an abstraction not depending on implementations of filesystems or even a filesystem but a more general and safe concept of hierarchical containment of packages.

hierarchical containment

Does it really need to be hierarchical in a sense of a tree of disjoint nodes? Or did you mean a directed graph (i.e. "tree with loops")? There are situations (actually not that unusual) when one needs a "non-cyclic dependency disguised as a cyclic dependency" - dependency of A of the module M1 on a B of the module M2 and at the same time a dependency of C in M2 on D in M1 whereas there is no (easy/supported/built-in/possible) way of making a deep query into the module and its data to find out whether it's actually a cycle or not. Resolving such dependencies only on the modules level (tree with disjoint nodes) would make such stuff impossible thus unnecessarily complicating development etc.

@dumblob For imports outside the subtree:

However I believe the module should not be able to directly import modules outside of its own directory; there, a globally unique reference like an npm package name that is resolved and security checked by the host would be needed.

I see a recursive parent child relationship could be helpful where the parent - not only the host - grants permission to its children. The parent can only grant permissions which it has been handed down from its own parent. This could include the permission to import a module that is not defined in the subtree but accessible to the parent. It could also decide to export modules inside of it to its own parent.

In the given example, the jpg-decoder Wasm module depends on a simd-math Wasm module. The jpg-decoder would declare the dependency with a global unique package identifier such as [email protected]/simd-math@^1.3.4 and request access to it from its parent. If the parent module has access to this module and it is considered "child safe" by the parent, access is granted and the imported simd-math module handed down to the jpg-decoder.
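A small sketch of that hand-down rule, assuming a hypothetical `Scope` class in which each scope holds a map of package identifiers to module handles. The identifiers (`simd-math`, `net`) and the `isChildSafe` policy callback are placeholders, not part of any existing tool.

```javascript
// Hypothetical model: a scope can only hand down modules it was itself granted.
class Scope {
  constructor(name, granted = new Map()) {
    this.name = name;
    this.granted = granted; // package id -> module handle
  }

  // Create a child scope holding the requested ids, provided this scope
  // holds them and its own policy considers them safe to pass on.
  createChild(name, requestedIds, isChildSafe = () => true) {
    const childGrants = new Map();
    for (const id of requestedIds) {
      if (!this.granted.has(id)) {
        throw new Error(`${this.name} cannot grant "${id}" to ${name}: not held`);
      }
      if (!isChildSafe(id)) {
        throw new Error(`${this.name} refuses to grant "${id}" to ${name}`);
      }
      childGrants.set(id, this.granted.get(id));
    }
    return new Scope(name, childGrants);
  }
}

// The host holds everything; the app only asks for (and gets) simd-math.
const host = new Scope("host", new Map([["simd-math", {}], ["net", {}]]));
const app = host.createChild("app", ["simd-math"]);
const decoder = app.createChild("jpg-decoder", ["simd-math"]);
console.log(decoder.granted.has("simd-math")); // true
console.log(decoder.granted.has("net"));       // false
```

Because `app` never received `net`, nothing below it can obtain network access, whatever it requests.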

Re: filesystem

First, let's lay out what the technology stack is.

You have:

  • .wasm binary files,
  • The WebAssembly APIs (C/C++/JS),
  • The WebAssembly host (hand-written JS, Webpack, WASI interpreter),
  • The host's persistent storage (usually a filesystem),
  • A wasm package manager,
  • A central package repository,
  • Tools meant to upload to that repository.

So, strictly speaking, the wasm binaries and instantiation APIs don't need to be filesystem-aware. Binaries only need a standard import syntax, and the API only needs a callback to which it can pass the parsed import URIs. Eg:

WebAssembly.instantiateStreaming(
    fetch('someFile.wasm'),
    {
        // ...
        resolveImport: (parsedURI) => { ... }
    }
);

Assuming we use an API similar to the above example, then hosts could in theory store modules however they want; they could pull from a non-filesystem storage, eg by interpreting URIs as requests into a SQL database.
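To make that concrete, here is a sketch that emulates the callback on top of the existing JS API. Note that `resolveImport` is not an actual option of `WebAssembly.instantiate`; the wrapper `instantiateWithResolver` is hypothetical and relies on the real `WebAssembly.Module.imports` to enumerate import strings. The two binaries are tiny hand-assembled modules, and the in-memory `Map` stands in for whatever non-filesystem storage a host might use.

```javascript
// Hand-assembled binary: exports a function "answer" that returns 42.
const providerBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,       // type section: () -> i32
  0x03, 0x02, 0x01, 0x00,                         // function section: one func of type 0
  0x07, 0x0a, 0x01, 0x06, 0x61, 0x6e, 0x73, 0x77, 0x65, 0x72, 0x00, 0x00, // export "answer"
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x41, 0x2a, 0x0b, // code section: i32.const 42
]);

// Hand-assembled binary: imports "dep"."answer" and re-exports it as "run".
const consumerBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,       // type section: () -> i32
  0x02, 0x0e, 0x01, 0x03, 0x64, 0x65, 0x70,       // import section: module "dep"
  0x06, 0x61, 0x6e, 0x73, 0x77, 0x65, 0x72, 0x00, 0x00, // field "answer", func of type 0
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x00, // export "run" = func 0
]);

// Hypothetical wrapper: instantiate `bytes`, asking `resolveImport` for the
// exports object of every imported module name.
function instantiateWithResolver(bytes, resolveImport) {
  const wasmModule = new WebAssembly.Module(bytes);
  const importObject = {};
  for (const { module: specifier } of WebAssembly.Module.imports(wasmModule)) {
    if (!(specifier in importObject)) {
      importObject[specifier] = resolveImport(specifier);
    }
  }
  return new WebAssembly.Instance(wasmModule, importObject);
}

// Non-filesystem storage: an in-memory map from import string to binary.
const store = new Map([["dep", providerBytes]]);
function resolveFromStore(specifier) {
  const bytes = store.get(specifier);
  if (!bytes) throw new Error(`Cannot resolve "${specifier}"`);
  // Dependencies are resolved recursively through the same callback.
  return instantiateWithResolver(bytes, resolveFromStore).exports;
}

const instance = instantiateWithResolver(consumerBytes, resolveFromStore);
console.log(instance.exports.run());
```

This sketch creates a fresh instance for every resolution; whether two importers should share one instance is the open question raised earlier in the thread.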

In practice, any developers who want to reuse community-made code would still need to populate that database with community-made modules, which would virtually always be built in a filesystem environment.

So while the host part of the above stack can be filesystem-agnostic, it has to interact with a filesystem-like package format, and a dependency system that is by nature tree-like (dependencies that have their own sub-dependencies that have their own sub-sub-dependencies, with no package name conflicts between different levels).

Overall, this all comes back to what @lukewagner and @ttraenkler said, that we need a holistic definition of what a package is (if not what filesystem format they use) before we can settle on an import scheme.


Does it really need to be hierarchical in a sense of a tree of disjoint nodes? Or did you mean a directed graph (i.e. "tree with loops")? There are situations (actually not that unusual) when one needs a "non-cyclic dependency disguised as a cyclic dependency" - dependency of A of the module M1 on a B of the module M2 and at the same time a dependency of C in M2 on D in M1 whereas there is no (easy/supported/built-in/possible) way of making a deep query into the module and its data to find out whether it's actually a cycle or not. Resolving such dependencies only on the modules level (tree with disjoint nodes) would make such stuff impossible thus unnecessarily complicating development etc.

Are these semi-cyclic dependencies really common once you contract intra-module subgraphs?

I'd expect that, if you treat each module as a single node, then in 99.9% of use cases you end up with an import DAG or an import tree.

I think it would make sense to have an import resolution scheme that assumes that dependencies may be cyclic inside a single package, but are tree-like otherwise (the npm model) with some non-cyclic exceptions (WASI, host libraries, peer dependencies, etc).
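To make the M1/M2 case above concrete, here is a small sketch that contracts an item-level dependency graph down to one node per module and then checks for cycles. The helper names `contract` and `hasCycle` are hypothetical, not part of any proposal.

```javascript
// Item-level dependencies from the example above: A (in M1) depends on B
// (in M2), and C (in M2) depends on D (in M1).
const itemModule = { A: 'M1', D: 'M1', B: 'M2', C: 'M2' };
const itemDeps = [['A', 'B'], ['C', 'D']];

// Contract each group of items to a single node, dropping intra-group edges.
function contract(deps, owner) {
  const graph = new Map();
  for (const name of Object.values(owner)) {
    if (!graph.has(name)) graph.set(name, new Set());
  }
  for (const [from, to] of deps) {
    if (owner[from] !== owner[to]) graph.get(owner[from]).add(owner[to]);
  }
  return graph;
}

// Depth-first search; reaching a node that is still being visited means
// the graph contains a cycle.
function hasCycle(graph) {
  const state = new Map(); // node -> 'visiting' | 'done'
  const visit = (node) => {
    if (state.get(node) === 'done') return false;
    if (state.get(node) === 'visiting') return true;
    state.set(node, 'visiting');
    for (const next of graph.get(node)) {
      if (visit(next)) return true;
    }
    state.set(node, 'done');
    return false;
  };
  return [...graph.keys()].some(visit);
}

// Each item owned by itself gives the uncontracted, item-level graph.
const identity = Object.fromEntries(Object.keys(itemModule).map((k) => [k, k]));
const itemGraph = contract(itemDeps, identity);
const moduleGraph = contract(itemDeps, itemModule);
```

With these inputs the item-level graph is acyclic, while the module-level graph has the M1 -> M2 -> M1 cycle, which is exactly the case a module-granular resolver would have to either reject or special-case.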

Okay, after going back and forth on this proposal I like the direction of it. However, I don't think there is any need to require specific prefixes or any other kind of "safety" mechanisms. Those should be handled by the build-system/pkg-manager, not the interpreter. A secure pkg-manager should be able to properly specify the complete dependency hierarchy and give it to the wasi interpreter, which then interprets as it is told. This relationship _might_ look something like this (semi-valid json):

[
  {
    "path": "./hash2k3jc/mod/foo.wasi",
    "dependencies": [
      {
        "import": ["mod", "bar"],  # (import "mod" "bar" ...)
        "path": "./hash2k3jc/mod/bar.wasi",
      }
    ]
  },
  {
    "path": "./hash2k3jc/mod/bar.wasi",
    "dependencies": [],
  },
]

Here "mod foo" has 1 dependency ("mod bar"), which itself has no dependencies.

This is a flat list of wasi files (path) that specifies how to map import statements to specific wasi files (dependencies, each with an import and a path).
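A minimal sketch of how an interpreter might consume such a list follows. The manifest is written as a plain JS value in the spirit of the example above, with the dependency path pointing at the bar.wasi entry in the list (an assumption about the intent), and `lookupDependency` is a hypothetical helper name.

```javascript
// Manifest modelled on the example above; the inline comment and trailing
// commas are dropped so it is a valid JS value.
const manifest = [
  {
    path: './hash2k3jc/mod/foo.wasi',
    dependencies: [
      { import: ['mod', 'bar'], path: './hash2k3jc/mod/bar.wasi' },
    ],
  },
  { path: './hash2k3jc/mod/bar.wasi', dependencies: [] },
];

// Map (import "moduleName" "fieldName") inside the file at referrerPath to
// the file the package manager chose for it. The interpreter makes no
// decisions of its own; it only follows the list.
function lookupDependency(manifest, referrerPath, moduleName, fieldName) {
  const entry = manifest.find((m) => m.path === referrerPath);
  if (!entry) throw new Error(`unknown module: ${referrerPath}`);
  const dep = entry.dependencies.find(
    (d) => d.import[0] === moduleName && d.import[1] === fieldName
  );
  if (!dep) {
    throw new Error(`${referrerPath} has no mapping for ${moduleName}.${fieldName}`);
  }
  return dep.path;
}
```

Anything not listed under a module's own dependencies is simply unresolvable from that module, so the safety policy lives entirely in whatever tool generated the list.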

Note that this build system chooses to use hashes to be 100% explicit and secure. Other build systems (e.g. closed, internal ones) could choose to be more lax to allow in-place updates. The point is that this representation can support either approach.

IMO this most closely reflects https://webassembly.org/docs/dynamic-linking/ -- placing absolutely no requirements on the wasi files or interpreter, instead choosing to give maximum flexibility and simplicity.

Note: I think assuming a file-system for the wasi interpreter is appropriate.

Note: I am working on wake, a "build system for the web", and would like to (eventually) have wasi be the primary target _and_ build environment. I would like to be able to ship signed dependencies so that sharing libraries is more lightweight than packaging-the-world.

Note: I think assuming a file-system for the wasi interpreter is appropriate.

I actually have quite the opposite opinion :wink:. There are already serious WASM use cases where there is absolutely no file system and never will be (e.g. new or experimental operating systems that have no file system, only a DB-like programmatic interface providing the most basic persistent storage addressing; in its simplest form it might be non-volatile RAM hardware with no need for a tree-like structure such as a filesystem). So a filesystem must not be a prerequisite for a WASI interpreter (IMHO not even a universal WASI package manager should require filesystem-like storage; it should rather work with linear persistent storage than with tree-like storage, block storage, key-value storage or anything else).

Possibly, but I also don't want to require that all dependencies are on a single contiguous block of storage either :smile: (such as using position and size instead of path). path could be used as a key in a DB-like programming interface, no? Maybe we should call it something other than path... key maybe?

I see having some mechanism to look up binary blobs by a key as a pretty essential feature of a package manager :smile:

In the JS spec, we don't actually give modules any sort of specifier. We perform resolution by (referringModuleInstance, importString), and the host has to decide for itself whether that import should be allowed, where it resolves to, etc. I think Wasm is currently even vaguer, and WASI can impose additional constraints like "if the result of resolving (referringModuleInstance, importString) should not be transitively accessible to the importing module, exit with xyz code". To wasmtime, "transitively accessible" probably involves unix-like file permissions. To something like fastly/cloudflare, it probably means "part of the same user-uploaded archive".
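As an illustration of the "same user-uploaded archive" reading, a host policy could look roughly like the sketch below. The archive table, the module keys, and the `resolve` helper are hypothetical, and the referrer is a plain key here rather than a module instance.

```javascript
// Hypothetical host state: which uploaded archive each module belongs to.
const archiveOf = new Map([
  ['app/main', 'archive-1'],
  ['app/util', 'archive-1'],
  ['other/secret', 'archive-2'],
]);

// Resolve (referrer, importString) under the policy "a module may only
// import modules from its own archive".
function resolve(referrer, importString) {
  const target = importString; // a real host would normalize or map this
  if (!archiveOf.has(target)) {
    throw new Error(`not found: ${target}`);
  }
  if (archiveOf.get(target) !== archiveOf.get(referrer)) {
    throw new Error(`${referrer} may not import ${target}`);
  }
  return target;
}
```

A host like wasmtime would replace the archive comparison with whatever "transitively accessible" means for it, such as a file permission check.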
