Rust: Make Rust work with emscripten

Created on 18 Apr 2012  ·  30 Comments  ·  Source: rust-lang/rust

I've spent some time poking at this problem and rustc now generates code that emscripten can translate, but the compiled javascript fails when it hits a runtime function. The next step is to start building the runtime using emcc as the compiler. Stub out all the things that don't build behind EMSCRIPTEN ifdefs.

Emscripten is adding a way to treat inline assembly as javascript, so all the parts of the runtime that don't build with emscripten can be implemented inline with javascript.

Alternately, we could reimplement the runtime piecemeal in javascript and not bother compiling it from C++ at all. This approach isn't recommended.

A-runtime E-hard

All 30 comments

See also #3608.

Would still be nice; not on any maturity milestone.

Still would be nice, but is not very important. This should get easier, since a lot of the runtime is being rewritten in rust.

Now that the runtime is written in Rust, how does that change the prospects for this bug? How hard would it be to get a runtimeless Hello World running through emscripten?

It should not be particularly difficult to add really nice support for emscripten now. It already pretty much works with rust-core. In the compiler we need to add support for the proper target triple, set up the various target attributes correctly, then fence off the few parts of the runtime that can't currently work in js, threading and context switching.

Once 1:1 scheduling mode matures a bit more we may even be able to add support for tasks via web workers, though presently it would require a different message passing solution. Depending on what parallelism support gets added to js/emscripten we may eventually be able to support rust's message passing semantics precisely.

@brson: I think #10780 would be the biggest blocker at the moment. Rust will output landing pads with calls to update the size used to do stack safety via LLVM's segmented stack support.

Thanks to -Z no-landing-pads this now works fine! Adding explicit support for this to the standard library is possible, but most of it isn't going to work anyway (files, tcp, udp, etc.) so I don't think it is necessary. If and when the standard library picks up freestanding support, it will start working and we can open more issues based on functionality we can map to JavaScript.

If it's ok, I'd actually like to leave this open for now. I think that doing this would be a good step forward towards ensuring the standard library is extensible and able to run on any number of platforms.

I do agree that most of the work is probably done, and this will probably need libemscripten to provide emscripten-specific I/O, but I think that there may be enough snags hit along the way that this is worth leaving the issue open for (it's still an interesting project!)

@alexcrichton: It won't be possible to provide the standard library's concurrency and I/O support for emscripten. At best, it could output to the console for stdout/stderr. I can't think of any things in the standard library that are going to be a good idea for an emscripten target but not a freestanding one, beyond a default allocator implementation.
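To illustrate the "output to the console for stdout/stderr" idea, here is a minimal sketch of a line-buffering writer. `ConsoleWriter` and its `sink` callback are names invented for this example; the sink merely stands in for whatever binding to JavaScript's `console.log` a real emscripten target would provide, which is an assumption and not something this thread specifies.

```rust
use std::io::{self, Write};

/// Buffers bytes until a newline arrives, then hands the finished line
/// to `sink`. The sink is a stand-in for a console.log binding (hypothetical).
pub struct ConsoleWriter<F: FnMut(&str)> {
    pending: Vec<u8>,
    sink: F,
}

impl<F: FnMut(&str)> ConsoleWriter<F> {
    pub fn new(sink: F) -> Self {
        ConsoleWriter { pending: Vec::new(), sink }
    }
}

impl<F: FnMut(&str)> Write for ConsoleWriter<F> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.pending.extend_from_slice(buf);
        // Emit every complete line; keep the unfinished tail buffered.
        while let Some(pos) = self.pending.iter().position(|&b| b == b'\n') {
            let line: Vec<u8> = self.pending.drain(..=pos).collect();
            let text = String::from_utf8_lossy(&line[..pos]).into_owned();
            (self.sink)(text.as_str());
        }
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        // Push out a trailing partial line, if there is one.
        if !self.pending.is_empty() {
            let text = String::from_utf8_lossy(&self.pending).into_owned();
            self.pending.clear();
            (self.sink)(text.as_str());
        }
        Ok(())
    }
}
```

Line buffering is chosen here because console APIs log whole messages rather than byte streams; other designs (for example flushing on every write) would also be plausible.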

Status update:

@alexcrichton has refactored the standard library into a bunch of smaller libraries with more understandable dependencies. It should be almost trivial to get the core, alloc, rand, and collections libraries to compile to the web now.

Here is how I would suggest tackling this:

  • Do some experiments with the rust and LLVM toolchains to understand how the codegen works for cross-compiling to asm.js.
  • Build libcore with emscripten manually and prove that it works on the web.
  • Modify rustc and mk/platform.mk to understand an emscripten-specific target triple, and produce a cross-compile toolchain that contains nothing but libcore.rlib.
  • Tackle liballoc by allowing it to work with the system (in this case emscripten-provided) malloc, then the other runtime-free crates.

That's a pretty good start!
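The liballoc step in the list above (deferring to the system malloc) can be sketched with today's allocator API. This is an assumption-laden illustration: the `GlobalAlloc` trait and `std::alloc::System` were stabilized years after this thread, and `SystemBacked` is a name made up for the example. On an emscripten target, `System` would be expected to resolve to the malloc that emscripten's libc provides.

```rust
use std::alloc::{GlobalAlloc, Layout, System};

/// Allocator that does no work of its own and simply defers to the
/// platform allocator (hypothetical name, for illustration only).
pub struct SystemBacked;

unsafe impl GlobalAlloc for SystemBacked {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Forward the request unchanged to the system allocator.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // Return the block to the same allocator that produced it.
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Route every heap allocation in the program through `SystemBacked`.
#[global_allocator]
static GLOBAL: SystemBacked = SystemBacked;
```

With this in place, ordinary collections such as `Vec` and `String` allocate through the wrapper without any further changes.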

Ok, so I tried with the first steps, and obviously got problems right off the bat.

I compiled libcore to bitcode with --emit bc, and when trying to compile it with emcc -O0, I get:

/Users/arcnor/emscripten-fastcomp/build/bin/llvm-nm: /tmp/tmpfTkmfj/core_0.o: Invalid CMPXCHG record.
/Users/arcnor/emscripten-fastcomp/build/bin/opt: /tmp/tmpfTkmfj/core.bc: error: Invalid CMPXCHG record
Traceback (most recent call last):
  File "/Users/arcnor/emscripten/emcc", line 1573, in <module>
    shared.Building.llvm_opt(final, link_opts)
  File "/Users/arcnor/emscripten/tools/shared.py", line 1335, in llvm_opt
    assert os.path.exists(target), 'Failed to run llvm optimizations: ' + output
AssertionError: Failed to run llvm optimizations:

Not sure if I can do anything about this, or it's because we cannot use the rustc --emit output for this.

Sorry if this is not the place to comment about this...

I also tried with libnum, a simpler one, and the bc generates correctly, but I get a warning during the emcc process about using the wrong triple and the resulting .js doesn't have any of the functions inside libnum, so I think I'm being too naive here :)

@Arcnor You might ask some of those who have previously compiled simple tests with emscripten about their process. I only have a few ideas.

  • LLVM bitcode changes from version to version and the version that Rust uses is not always the same as emscripten. Getting them both using a relatively similar version of LLVM can improve compatibility.
  • From your error message you appear to be using emscripten's new 'fastcomp' backend. This may be less tested on Rusty workloads than their old backend. Manually opting into the old backend may at least produce different results.
  • Emscripten generally uses its own target triple, so rustc may need to be persuaded to use the same one.

The error when trying to compile libcore appears to be related to this emscripten issue. Compiling libcore to LLVM bitcode generates LLVM atomic instructions, but emscripten does not support atomic instructions.

There may be a way to work around this from the rust side, but based on the comments in the emscripten issue I think getting support for atomics into emscripten makes the most sense.

If emscripten has its own platform we could perhaps cfg-out all the atomics for their single-threaded variants, but I agree that it would be nicer to have this in upstream emscripten!
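As a rough sketch of what "cfg-out the atomics for their single-threaded variants" could look like, the type below keeps one API and swaps the backing storage per target. `Counter` is a hypothetical name, and `target_os = "emscripten"` is the cfg value current Rust uses for emscripten targets; no such target existed in rustc when this was discussed, so treat the predicate as an assumption.

```rust
// Multi-threaded targets: back the counter with a real atomic.
#[cfg(not(target_os = "emscripten"))]
pub struct Counter(std::sync::atomic::AtomicUsize);

#[cfg(not(target_os = "emscripten"))]
impl Counter {
    pub const fn new(v: usize) -> Self {
        Counter(std::sync::atomic::AtomicUsize::new(v))
    }
    /// Adds `n` and returns the previous value.
    pub fn fetch_add(&self, n: usize) -> usize {
        self.0.fetch_add(n, std::sync::atomic::Ordering::SeqCst)
    }
    pub fn load(&self) -> usize {
        self.0.load(std::sync::atomic::Ordering::SeqCst)
    }
}

// Single-threaded target: a plain Cell emits no atomic instructions.
#[cfg(target_os = "emscripten")]
pub struct Counter(std::cell::Cell<usize>);

#[cfg(target_os = "emscripten")]
impl Counter {
    pub const fn new(v: usize) -> Self {
        Counter(std::cell::Cell::new(v))
    }
    /// Adds `n` (wrapping, like the atomic version) and returns the previous value.
    pub fn fetch_add(&self, n: usize) -> usize {
        let old = self.0.get();
        self.0.set(old.wrapping_add(n));
        old
    }
    pub fn load(&self) -> usize {
        self.0.get()
    }
}
```

The `Cell` variant is not `Sync`, which is the point: it is only sound on a target that never runs more than one thread.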

If I'm not mistaken, the new "fastcomp" backend of emscripten is a fork of LLVM (while the previous backend was just a layer above LLVM), so the LLVM version of fastcomp is probably difficult to upgrade and won't be upgraded frequently.

This will be problematic if it needs to be compatible with the output of Rust. For example right now the LLVM version of fastcomp is 3.3, while the LLVM used by Rust is 3.4.

The old emscripten backend is deprecated and shouldn't be used according to the official docs, so it's probably not an option to use it.

I seem to be the only one trying to compile for emscripten for the moment.

For the record, here are the things that I tried:

  • Compiling to bitcode (generated by Rust's LLVM 3.4) and passing it to fastcomp (fork of LLVM 3.3); causes fastcomp to crash
  • Compiling to IR, editing it manually until it is compatible with LLVM 3.3 and passing it to fastcomp; too complicated, too many things to modify for any non-trivial code
  • Compiling Rust stage1 with --llvm-root pointed to emscripten's fastcomp; that didn't work because they removed support for ARM/MIPS/etc. in their fork (I'm getting errors from the makefiles and during linkage because of this)
  • Modifying the LLVM git submodule in Rust's source code to point to an old commit from the 3.3 era; getting a segfault at some point in LLVM
  • Compiling Rust with --llvm-root pointed to a pre-compiled LLVM 3.3 (coming from the official ubuntu repo); getting an assertion failed at the end of the compilation of stage1 and the produced rustc binary doesn't work.

Unless someone has an idea, my conclusion is that we need to wait for emscripten to upgrade.

rum seems to have it working, sorta; perhaps this will help

Minor update: emscripten-fastcomp has been updated to LLVM 3.4, and will be updated to LLVM 3.5 later.

@tomaka have you tried doing anything with the 3.4 version? I was able to get the rum example compiling with it, but anything more failed with unintelligible errors.

@ibdknox 3.4 is incompatible with 3.5
Even a simple hello world produces a failed assertion: LLVM ERROR: 0 && "some i64 thing we can't legalize yet"

Hm. I was able to take the output from rustc --emit ir foo.rust and run it through emscripten-incoming. Is rust now on LLVM 3.5?

Rust has been using LLVM 3.5 for a long time now. You can be lucky and have nothing incompatible get generated.
For example this compiles just fine:

#[start]
fn main(_: int, _: *const *const u8) -> int {}

This doesn't because of incompatible IR:

fn main() { println!("hello world"); }

@ibdknox http://www.reddit.com/r/rust_gamedev/comments/2n0x08/emscripten_experiments/
It looks like there are fewer incompatibilities than I thought.

As an update, when I compile hello world with emscripten that has now been updated to 3.5, I'm getting the following:

Value:   %28 = call fastcc { i8, [0 x i8], [0 x i8] } @_ZN3fmt5write20h2c56fdda0b308d94DFAE({ i8*, void (i8*)** }* noalias nocapture dereferenceable(8) %arg.i, %"struct.core::fmt::Arguments[#3]"* noalias nocapture readonly dereferenceable(24) %__args31), !noalias !22
LLVM ERROR: Unrecognized struct value
Traceback (most recent call last):
  File "/Users/chris/Downloads/emsdk_portable/emscripten/incoming/emcc", line 1259, in <module>
    shared.Building.llvm_opt(final, link_opts)
  File "/Users/chris/Downloads/emsdk_portable/emscripten/incoming/tools/shared.py", line 1401, in llvm_opt
    assert os.path.exists(target), 'Failed to run llvm optimizations: ' + output
AssertionError: Failed to run llvm optimizations:

Here's how I'm compiling:

rustc --target i686-apple-darwin -C lto --emit ir foo.rust
emcc -v foo.ll -o test.html

Things that don't seem to bring in fmt generally seem to work, though.

I've been spending my free time this last week looking into this. I read rust's book sometime between the summer and now and really liked the mechanics of the language but only recently started implementing something with it. I only know as much about the rust compiler as I learned this week, but I hope I can contribute.

So I guess the first thing to note of what I learned (but that took me a few evenings to notice) is that Rust moved to LLVM 3.6 in July. So the current versions of Rust and emscripten-fastcomp are incompatible.

I tried compiling rust with --llvm-root pointing to emscripten-fastcomp 1.29.2 and got this error:

rustc: x86_64-apple-darwin/stage2/lib/rustlib/x86_64-apple-darwin/lib/libcore
error: internal compiler error: unexpected panic
note: the compiler unexpectedly panicked. this is a bug.
note: we would appreciate a bug report: http://doc.rust-lang.org/complement-bugreport.html
note: run with `RUST_BACKTRACE=1` for a backtrace
thread 'rustc' panicked at 'assertion failed: self.raw.hash != self.hashes_end', /Users/zen/Code/rust/src/libstd/collections/hash/table.rs:776


make: *** [x86_64-apple-darwin/stage2/lib/rustlib/x86_64-apple-darwin/lib/stamp.core] Error 101

To get to this error I configured and built emscripten-fastcomp with

../configure --enable-optimized --disable-assertions --enable-targets=host,js,arm,aarch64,mips

instead of the emscripten guide's recommended

../configure --enable-optimized --disable-assertions --enable-targets=host,js

Though Rust doesn't need to be built for all targets, it currently always links against LLVM with CPU support compiled for all targets. This is a workaround for a problem that could be fixed in the future so we may not need to always compile emscripten-fastcomp with that configuration.

Once I found that Rust had moved to LLVM 3.6 I looked up the last branch on rust-lang/llvm that was LLVM 3.5. https://github.com/rust-lang/llvm/tree/rust-llvm-2014-07-24 I compiled against that instead of emscripten-fastcomp, curious to see what would turn out. I got exactly the same error as when compiling against emscripten-fastcomp after its recent move to LLVM 3.5. I take this to mean that Rust is in some way incompatible with LLVM 3.5 now and I wouldn't really expect otherwise.

So now we wait or have to get emscripten-fastcomp to LLVM 3.6 :wink:

It's worth mentioning that I downloaded an archived 0.11 copy and was able to produce LLVM IR for hello world that emcc understood, but then reached the problem of linking. It was pretty exciting to see it get past understanding the bitcode, but actually getting it linking is going to need work in the rust code base.

I took a peek at merging rust-lang/llvm into emscripten-fastcomp. At the time there were 117 conflicting sections over 43 files.

I mentioned getting Rust 0.11 and emcc 1.29.2 to get to the linking stage. This is the specific result:

$ emcc -v hello.ll -o hello.js
INFO     root: (Emscripten: Running sanity checks)
WARNING: Linking two modules of different data layouts: '/Users/zen/.emscripten_cache/libc.bc' is 'e-p:32:32-i64:64-v128:32:128-n32-S128' whereas '/tmp/tmpv_yB8E/hello_0.o' is 'e-p:32:32-f64:32:64-f80:128-n8:16:32'
WARNING: Linking two modules of different target triples: /Users/zen/.emscripten_cache/libc.bc' is 'asmjs-unknown-emscripten' whereas '/tmp/tmpv_yB8E/hello_0.o' is 'i686-apple-darwin'
warning: incorrect target triple 'i686-apple-darwin' (did you use emcc/em++ on all source files and not clang directly?)
warning: unresolved symbol: _ZN2io5stdio12println_args20h0caae70b0e2eb347Iol7v0_11_0E
warning: unresolved symbol: _ZN10lang_start20h70f93b7d0a75f99atre7v0_11_0E
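To see exactly where the two data layouts in the first warning disagree, one can split both strings on `-` and compare the pieces. The helper below is a hypothetical sketch (the name `layout_diff` is invented here); it treats each spec as an opaque string and does not interpret LLVM's data layout semantics.

```rust
/// Returns the specs that appear only in `left` and only in `right`,
/// comparing the dash-separated components as plain strings.
pub fn layout_diff<'a>(left: &'a str, right: &'a str) -> (Vec<&'a str>, Vec<&'a str>) {
    let l: Vec<&str> = left.split('-').collect();
    let r: Vec<&str> = right.split('-').collect();
    // Keep each component that has no exact match on the other side.
    let only_left = l.iter().copied().filter(|s| !r.contains(s)).collect();
    let only_right = r.iter().copied().filter(|s| !l.contains(s)).collect();
    (only_left, only_right)
}
```

Fed the two strings from the warning, it leaves `e` and `p:32:32` as the shared part; the integer, float, vector, native-width and stack-alignment specs are the ones that differ between emscripten's libc.bc and the i686-apple-darwin output.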

Seems emcc/fastcomp replaces dots in symbols with underscores while Rust expects to prefix with another underscore, but I'm not too sure about this. The first unresolved symbol appears as __ZN2io5stdio12println_args20h0caae70b0e2eb347Iol7v0.11.0E in libstd in the i686-apple-darwin build. Even if I could get emcc to know how to find this symbol in the built libraries, I'm guessing that the libs contain machine code while emcc will need LLVM bitcode. I think I recall someone mentioning needing to compile the standard library for emscripten. This would be part of the need for that.
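The renaming guessed at here can be checked against the two spellings quoted in this comment. The function below is a hypothetical helper (not part of emcc or rustc) that drops one leading underscore, as Darwin symbol tables conventionally carry an extra one, and replaces dots with underscores. Whether emcc really performs these steps is the commenter's guess, not an established fact.

```rust
/// Rewrites a symbol as found in the i686-apple-darwin libstd into the
/// spelling emcc reported as unresolved (assumed transformation).
pub fn emcc_style_symbol(native: &str) -> String {
    // Drop a single leading underscore if present.
    let stripped = native.strip_prefix('_').unwrap_or(native);
    // Replace every dot, e.g. the ones in the "v0.11.0" version suffix.
    stripped.replace('.', "_")
}
```

Applied to the libstd spelling above, this yields the name from the `unresolved symbol` warning, which is at least consistent with the guess.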

So here are the next steps I'm looking to try and work on if anyone wants to take a shot at them themselves. (Or can let me know how right or wrong I am.)

  • Merge rust-lang/llvm into emscripten-fastcomp
  • Build rust with merged fastcomp without JS backend support
    Hoping this will be a nice sanity test for the merge.
  • Add emscripten's triple to Rust and build it
    From what I can tell there are a number of files I need to change or add.

    • mk/cfg/asmjs-unknown-emscripten.mk

    • rt/arch/asmjs/{morestack.S,record_sp.S} (might be able to be empty?)

      These files are needed to build morestack.a for Rust to support LLVM's segmented stack. If I recall correctly, Emscripten's stack also lives in the heap: it uses the opposite end of the heap as the stack, and since for asmjs you can't change the size of the array, creating new stack segments isn't possible. I saw a field in TargetOption in the librustc_back target files that hopefully can disable this.

    • librustc_trans/trans/cabi_asmjs.rs

    • librustc_trans/trans/cabi.rs

      I'm not sure if these will be needed, currently just a guess.

    • librustc_back/target/asmjs_unknown_emscripten.rs

    • librustc_back/asmjs.rs

    • librustc_syntax/abi.rs

    • librustc_back/back/write.rs configure_llvm()

    • librustc_llvm/lib.rs static_link_hack_this_sucks()

  • Implement missing system interfaces
    I saw that libgreen was removed in November. Since emscripten has to wait for browsers to offer some way to share with workers, if that ever happens, something like libgreen would need to be restored, or pthreads shimmed in some fashion specifically for emscripten, much like how rust builds for pthreads or the windows threads api.

IO too. Probably other parts I'm unaware of.
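One very simple form the threading shim mentioned above could take is a spawn function that falls back to running the task immediately on a target without threads. This is only a sketch under assumptions: `Task` and `spawn_task` are invented names, `target_os = "emscripten"` is the modern cfg value, and running inline is not equivalent to a real task (a task that waits on a message from its spawner would deadlock).

```rust
/// Result of `spawn_task`: either a real thread or an already-finished value.
pub enum Task<T> {
    Thread(std::thread::JoinHandle<T>),
    Done(T),
}

impl<T> Task<T> {
    /// Waits for the task (if it is still running) and returns its result.
    pub fn join(self) -> T {
        match self {
            Task::Thread(handle) => handle.join().expect("task panicked"),
            Task::Done(value) => value,
        }
    }
}

// Targets with OS threads: spawn normally.
#[cfg(not(target_os = "emscripten"))]
pub fn spawn_task<T, F>(f: F) -> Task<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    Task::Thread(std::thread::spawn(f))
}

// Single-threaded target: run the closure to completion right away.
#[cfg(target_os = "emscripten")]
pub fn spawn_task<T, F>(f: F) -> Task<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    Task::Done(f())
}
```

The bounds are kept identical on both targets so code written against one compiles against the other; a libgreen-style scheduler or worker-backed tasks would need considerably more than this.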

"Merge rust-lang/llvm into emscripten-fastcomp"

You might not want to do this - Emscripten is based on pnacl-llvm/pnacl-clang so you're creating a fork with patches on patches, which will probably be painful. If you're interested, you can see some detail of the branching in the investigation I did for the Emscripten merge from r33 -> r34 at https://github.com/kripken/emscripten-fastcomp/issues/51#issuecomment-62323164.

I hear that pnacl is planning on tracking upstream a bit closer than before but I can't see any relevant issue in the pnacl issue tracker to update to 3.6 so it may be a while (especially given 3.6 only branched 5 days ago!)...I guess you could create an issue? If you decide against your own Emscripten fork, I see two options - waiting for pnacl or helping Emscripten get off pnacl and onto upstream.

Edit: corrected 'now' to 'not'. A crucial difference.

I'm pulling a massive triage effort to get us ready for 1.0. As part of this, I'm moving stuff that's wishlist-like to the RFCs repo, as that's where major new things should get discussed/prioritized.

This issue has been moved to the RFCs repo: rust-lang/rfcs#604
