mruby irb now runs in a browser

Published on:

After some time's work, I finally have a working irb for mruby. I'm such a lazy guy that you may have already seen the demo from this, this or this. Anyway, for those of you who haven't seen it, the demo is here.

With all the work already done in webruby, this was actually not so hard to implement. However, there are still two things I want to write down here as notes.

Passed-by-value Structs

For simplicity, the web irb uses mrb_load_string to parse and execute Ruby source in one step. Here comes the problem: the function signature looks like the following:

mrb_value
mrb_load_string(mrb_state *mrb, const char *s);

Here mrb_value is a struct type, and it is returned by value. This forces emscripten to generate a JavaScript function like the following:

function _mrb_load_string($agg_result, $mrb, $s) {
  // code omitted...

}

The $agg_result variable marks a place in the heap where the return value is stored. As a consequence, it is hard for us to make up a reasonable value for it when calling from JavaScript, so we have to think of another way.

Luckily, we do not need the return value here. Hence we can simply create a wrapper:

int driver_execute_string(mrb_state *mrb, const char *s)
{
  mrb_load_string(mrb, s);

  return 0;
}

If we later decide to add logic that checks the return value of mrb_load_string, we can simply add it here. For this driver function, the generated JS function only requires two arguments: the mrb state and the string to load.
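To make the difference concrete, here is a minimal sketch that does not depend on mruby at all. Everything in it is a hypothetical stand-in: fake_value plays the role of mrb_value, fake_load_string plays the role of mrb_load_string, fake_load_string_lowered shows roughly what a struct-returning function looks like once the hidden result pointer is made explicit, and driver_execute_fake is the int-returning wrapper. The success check is made up for illustration only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for mrb_value: any struct returned by value
 * is handled the same way. */
typedef struct {
  int ok;      /* 1 if the "script" was non-empty (made-up rule) */
  double len;  /* length of the script, just to carry some data */
} fake_value;

/* Stand-in for mrb_load_string: returns a struct by value. */
fake_value fake_load_string(const char *s) {
  fake_value v;
  v.ok = (s != NULL && s[0] != '\0');
  v.len = (s != NULL) ? (double)strlen(s) : 0.0;
  return v;
}

/* Roughly what the call looks like after lowering: the caller must
 * supply a pointer to memory for the result (the $agg_result argument). */
void fake_load_string_lowered(fake_value *agg_result, const char *s) {
  *agg_result = fake_load_string(s);
}

/* Wrapper in the style of driver_execute_string: only scalar arguments
 * and a scalar return value, so it is easy to call from JavaScript.
 * Returns 0 on success and 1 on failure. */
int driver_execute_fake(const char *s) {
  fake_value v = fake_load_string(s);
  return v.ok ? 0 : 1;
}
```

Calling driver_execute_fake("puts 1") returns 0, while driver_execute_fake("") returns 1. The caller never has to allocate space for a struct, which is the whole point of the wrapper.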

JavaScript code optimization

The generated JavaScript library is around 4.5M, and it contains a lot of whitespace and comments. Normally we do not want the browser to load "human-readable" JavaScript source code, so an optimizer is needed here.

Emscripten has a built-in optimizer, but it does not work with mruby:

Stack: Error
    at assertTrue (eval at globalEval (/Users/rafael/develop/webruby/modules/emscripten/src/compiler.js:103:8))
    at substrate.addActor.processItem.item.functions.forEach.Functions.blockAddresses.(anonymous function) (eval at globalEval (/Users/rafael/develop/webruby/modules/emscripten/src/compiler.js:103:8))
    at Array.forEach (native)
    at substrate.addActor.processItem.item.functions.forEach.Functions.blockAddresses.(anonymous function) (eval at globalEval (/Users/rafael/develop/webruby/modules/emscripten/src/compiler.js:103:8))
    at Array.forEach (native)
    at substrate.addActor.processItem (eval at globalEval (/Users/rafael/develop/webruby/modules/emscripten/src/compiler.js:103:8))
    at Array.forEach (native)
    at Object.substrate.addActor.processItem (eval at globalEval (/Users/rafael/develop/webruby/modules/emscripten/src/compiler.js:103:8))
    at Object.Actor.process (eval at globalEval (/Users/rafael/develop/webruby/modules/emscripten/src/compiler.js:103:8))
    at Object.Substrate.solve (eval at globalEval (/Users/rafael/develop/webruby/modules/emscripten/src/compiler.js:103:8))

undefined:54
    throw msg;
          ^
Assertion failed: Only some can lead to labels with phis:_mrb_run,51,indirectbr
Traceback (most recent call last):
  File "/Users/rafael/develop/webruby/modules/emscripten/emscripten.py", line 402, in <module>
    temp_files.run_and_clean(lambda: main(keywords))
  File "/Users/rafael/develop/webruby/modules/emscripten/tools/shared.py", line 420, in run_and_clean
    func()
  File "/Users/rafael/develop/webruby/modules/emscripten/emscripten.py", line 402, in <lambda>
    temp_files.run_and_clean(lambda: main(keywords))
  File "/Users/rafael/develop/webruby/modules/emscripten/emscripten.py", line 358, in main
    emscript(args.infile, settings, args.outfile, libraries)
  File "/Users/rafael/develop/webruby/modules/emscripten/emscripten.py", line 228, in emscript
    for func_js, curr_forwarded_data in outputs:
ValueError: need more than 1 value to unpack
Traceback (most recent call last):
  File "/Users/rafael/develop/webruby/modules/emscripten/emcc", line 1092, in <module>
    final = shared.Building.emscripten(final, append_ext=False, extra_args=extra_args)
  File "/Users/rafael/develop/webruby/modules/emscripten/tools/shared.py", line 902, in emscripten
    assert os.path.exists(filename + '.o.js') and len(open(filename + '.o.js', 'r').read()) > 0, 'Emscripten failed to generate .js: ' + str(compiler_output)
AssertionError: Emscripten failed to generate .js: 

Oops, maybe I need to turn to Alon for help here-_-

But luckily, we still have Closure Compiler, and it works on the JavaScript generated from mruby. With simple optimizations we can shrink the generated JavaScript source file to around 1.6M. This looks like a workable solution. Advanced optimizations require that we export the driver functions; otherwise all mruby-related code will be removed as dead code, since nothing in this single file ever calls it. Well, I'll come back to this later, 1.6M does not look that bad already~

Conclusion

Now the mruby irb runs in a browser, and the mruby tests also pass in both Node.js and a browser. The fun part can continue. I know I said in my previous post that I would work on an OpenGL ES 2.0 API. Well, the thing is, I've got a nice idea for an mruby-to-JavaScript calling interface. If I can get this working, we will have the WebGL API, canvas API, Web Audio API, etc. at hand instantly! Sounds nice, huh? And of course, it can and will be organized as an mrbgem, except that a small JavaScript part needs to be attached to the generated JS file via emscripten.

Anyway, I've already created the repository for this, let's see if I can make this work:)

It's official: mruby-browser is now called WebRuby!

Published on:

After a pretty long discussion and a fix (thanks to Alon for the awesome job in fixing this issue!), setjmp/longjmp in emscripten now works well. The C++ dependency in mruby-browser can finally be dropped, and all the mruby tests pass without any particular hack. We can now say that the basic building block is there, and I can turn to more interesting stuff. It is also at this point that I think the project needs a new, meaningful name. The original mruby-browser sounds too much like an experiment instead of a project for everyone to use.

The first thing that came to my mind was RubyScript. With CoffeeScript and ClojureScript already out there, it is not hard to see why this particular name came to mind. However, Google tells us that someone has already used this name. It looks quite like a hackathon project (20 commits in two days, and no commits since then for over a year). But I still do not want to take the risk that this guy (or lady, it is hard for me to judge by the name, can anyone give me a hint?) may want to bring the project back to life. Let's try something else.

What's worth mentioning is that someone is building MobiRuby, which brings mruby to iOS. Well, I'm bringing mruby to the Web, so what about WebRuby? Google tells us that not many people are using this name: one uses it as a repository for a web course, while the other is just a sinatra-based web backend. Well, I will just pick this one as the name:)

Besides changing the name, I also changed the build script a little bit. Now we can simply put mruby source code in the src folder, and it will be parsed and compiled when building the project. Only the final generated bytecode is included in the JS file or the webpage. This saves us the time of parsing the source code online. If at some point we need to parse mruby source code online, we can easily bring this back, since the parser code is still included in the generated JS code.

Personally, one good thing about owning an open source project is that you can decide the priority of each feature:) As a developer who always dreams about creating games, my next thing to work on will be a wrapper for the OpenGL ES 2.0 API. I still haven't come up with a beautiful idea for a calling interface between mruby and C (or JS), so I think I will first focus on wrappers for particular libraries. Since mrbgems is already in HEAD, it is a natural idea to pack this library as an mrbgem. The reason for choosing OpenGL ES 2.0 over WebGL is that OpenGL ES 2.0 is more general, and maybe it can also be used with mruby in iOS or Android development. What's more, emscripten provides a translation between OpenGL ES 2.0 and WebGL, so we can simply take advantage of that instead of writing yet another WebGL wrapper.

Just as I said before, it is really fun working on this:)

Make mruby tests pass in a browser

Published on:

With my Hadoop paper submitted last Friday, I can spend more time playing with mruby. Now after several days' hacking, I finally managed to make all mruby tests pass in a browser or in node.js.

$ make test
make[2]: Nothing to be done for `all'.
make[2]: Nothing to be done for `all'.
make[2]: Nothing to be done for `all'.
make[2]: Nothing to be done for `all'.
Running mruby test in Node.js!
node ./build/mruby-test.js
mrbtest - Embeddable Ruby Test

This is a very early version, please test and report errors.
Thanks :)

......................................................................................
......................................................................................
......................................................................................
......................................................................................
......................................................................................
.............................................................
Total: 491
   OK: 491
   KO: 0
Crash: 0
 Time: 1.999 seconds

Now it's time to write down how I made these tests pass.

Now that this issue is resolved (thanks to Alon Zakai for his super fast commit to fix this!), the mruby source code compiles successfully with emcc, and the sample main.c file also works. But there were still 5 tests left that did not pass: 2 of them failed, while the other 3 caused node.js to crash. These 5 tests are:

  • Tests for erf and erfc functions in math.rb
  • Float#round [15.2.9.3.12] in float.rb
  • String#to_f [15.2.10.5.39] in string.rb
  • Exception 14 in exception.rb
  • Proc.new [15.2.17.3.1] in proc.rb

To be honest, the result is quite good, since only 5 of the 489 tests had problems. I guess emscripten really has reached a pretty mature state thanks to Alon. Most of the problems here were fixed by commits directly to mruby or emscripten. However, there were also annoying ones. Anyway, I will explain how to make each of them pass.

erf and erfc functions

Honestly, this is the first time I have heard of these two functions. They live in the math.h header of the standard C library. The erf function calculates the error function of a value x, while the erfc function calculates the complementary error function of x. emscripten does not come with an implementation of these functions. However, there is an implementation in math.c of mruby for MSVC, which does not provide erf/erfc either. It was originally taken from here:

double
erf(double x)
{
  static const double two_sqrtpi =  1.128379167095512574;
  double sum  = x;
  double term = x;
  double xsqr = x*x;
  int j= 1;
  if (fabs(x) > 2.2) {
    return 1.0 - erfc(x);
  }
  do {
    term *= xsqr/j;
    sum  -= term/(2*j+1);
    ++j;
    term *= xsqr/j;
    sum  += term/(2*j+1);
    ++j;
  } while (fabs(term/sum) > MATH_TOLERANCE);
  return two_sqrtpi*sum;
}

What's worth noting is that the original mruby implementation contained a bug which gave wrong results for negative values. The original post from digitalmars also has a fix for this problem; the original committer simply used the earlier version without the fix. Hence a simple commit to the mruby project solved this problem. A similar version could also be implemented in JavaScript, and the erf/erfc tests would then pass.

Float#round test

This is an interesting and easy one. The test code resides here. Actually all the round tests give the correct result; what went wrong is that == is used to test two floating point values for equality. A small commit fixes this, easy one.
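The mruby tests themselves are written in Ruby, but the idea is the same in any language. As a minimal sketch in C (nearly_equal and its 1E-12 tolerance are names and values I picked for this example, in the spirit of check_float, not code from mruby):

```c
#include <math.h>

/* Compare two doubles with an absolute tolerance instead of ==.
 * The tolerance is an assumption chosen for this example. */
int nearly_equal(double a, double b) {
  return fabs(a - b) < 1E-12;
}
```

With IEEE doubles, 0.1 + 0.2 and 0.3 are not bit-for-bit identical, yet nearly_equal(0.1 + 0.2, 0.3) reports them as equal, which is what a test on rounded results actually wants.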

String#to_f test

This one is also related to floating point values. The code is here. b should be assigned 123456789.0, and when check_float compares b with 123456789.0, they should be treated as equal. The funny thing is that node.js gives 1.4901161193848e-08 as the difference between the two values, while check_float only considers two values to be the same if they are within 1E-12 of each other.

Simply changing Line #328 to 123456789 instead of 123456789.0 would give the correct result, but this is a very bad fix and does not really solve the problem. Basically there may be two reasons:

  1. Somewhere in the JavaScript code generated by emscripten, floating point values are not handled well.
  2. v8 does not provide enough precision for floating point values.

It is still unknown which of these causes the problem. What I chose to do for now is to let mruby use float instead of double. When using float, check_float accepts two values within 1E-5 of each other, and the current difference of 1.4901161193848e-08 is small enough for that. Anyway, I will come back to this later; maybe a dig into the v8 issue list can bring some insight.
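One observation that may help when coming back to this: 1.4901161193848e-08 is 2^-26, which is exactly the spacing between two adjacent doubles near 123456789. So the two values differ by a single unit in the last place, and an absolute tolerance of 1E-12 is tighter than double precision can deliver at this magnitude. A relative tolerance would not have this problem. The sketch below contrasts the two; close_abs and close_rel are hypothetical helpers written for this note, not mruby's check_float.

```c
#include <math.h>

/* Absolute check in the style described above:
 * passes only if |a - b| is below a fixed tolerance. */
int close_abs(double a, double b, double tol) {
  return fabs(a - b) < tol;
}

/* Relative check (an assumption, not what mruby does):
 * the tolerance is scaled by the larger of the two magnitudes. */
int close_rel(double a, double b, double tol) {
  double fa = fabs(a), fb = fabs(b);
  double scale = (fa > fb) ? fa : fb;
  return fabs(a - b) <= tol * scale;
}
```

With a = 123456789.0 and b = a + 1.4901161193848e-08, close_abs(a, b, 1E-12) fails while close_rel(a, b, 1E-12) passes, because the relative difference is only about 1.2e-16.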

Exception 14 and Proc.new [15.2.17.3.1]

Both the exception and proc tests crash node.js, and they both use a begin ... rescue ... end statement with a method call in the begin clause. A simple guess is that they are due to the same reason.

I spent a whole day debugging this problem by inserting debug statements in the mruby source code, and reading the generated logs as well as JavaScript source code written in assembly style. The LABEL_DEBUG option in emscripten proved to be a huge help here (thanks again, Alon!). Finally the problem turned out to be that mruby needs a setjmp/longjmp that manipulates the stack. I prepared a gist describing this problem:

#include <setjmp.h>
#include <stdio.h>

typedef struct {
  jmp_buf* jmp;
} jmp_state;

void stack_manipulate_func(jmp_state* s, int level) {
  jmp_buf buf;

  printf("Entering stack_manipulate_func, level: %d\n", level);

  if (level == 0) {
    s->jmp = &buf;
    if (setjmp(*(s->jmp)) == 0) {
      printf("Setjmp normal execution path, level: %d\n", level);
      stack_manipulate_func(s, level + 1);
    } else {
      printf("Setjmp error execution path, level: %d\n", level);
    }
  } else {
    printf("Perform longjmp at level %d\n", level);
    longjmp(*(s->jmp), 1);
  }

  printf("Exiting stack_manipulate_func, level: %d\n", level);
}

int main(int argc, char *argv[]) {
  jmp_state s;
  s.jmp = NULL;

  stack_manipulate_func(&s, 0);

  return 0;
}

The original gist also comes with logs from running this natively and via emscripten. With a stack-manipulating setjmp/longjmp, the longjmp discards the stack frame of the level 1 call of stack_manipulate_func, so the program calls the exiting printf only once. However, with the current implementation of setjmp/longjmp in emscripten, the stack is not changed, and the exiting printf is called by both the level 0 and the level 1 invocation of stack_manipulate_func.

I don't think it is very likely that we will have a stack-manipulating setjmp/longjmp in JavaScript. So the use of setjmp/longjmp needs to be removed from mruby. But wait, does this sound familiar? Didn't I just come up with a solution a few days earlier? Well, it is right there in my first post on mruby. I created a patch to use C++ exceptions instead of setjmp/longjmp and then found out that our simple main.c file does not need it to run. Well, now we do, so I have to bring it back. It is really bad news for a pathetic C99 lover to find that the dependency on a C++ compiler returns-_-

Quick Update: Actually Alon confirms that this is just a bug and it is fixed. So maybe we can still have a C99 solution to this problem. Interesting, I will take a look at this later. I really feel sick about using a C++ compiler, since it may bring in a lot of evil stuff when working on the later parts involving more C code, such as the C function calling interface.

Anyway, now all the tests pass, not only in node.js but also in browsers. However, the time to run the tests differs greatly:

  • Chrome 23: 1.682s
  • Firefox Aurora 18: 5.294s
  • Safari 6: 0.394s

This is interesting. Safari is so fast that one can think something went wrong for the other two.

At this point, I believe the testing of mruby in JavaScript is finished. I will spend some time trying to create an irb for the browser and see if I can get it on repl.it. After that I can finally spend time on C function calling. I hope it will be more fun than debugging the generated JavaScript code-_-

Running mruby in a browser

Published on:

tl;dr version: I managed to compile mruby to JavaScript via emscripten. The source code is in a GitHub repository.

A Little Background

I really love Ruby, yet I'm not that much into Rails. Rails is great indeed, and I learned so much while playing around with it. In fact, more than half of my Ruby knowledge came from reading the Rails source code. However, I just get bored writing applications that work on a database again and again-_- Whatever new features my app has and whatever fascinating features Rails provides, basically I just keep writing code to create entries in a db, read these entries out, and update and delete them occasionally (Thank you, CRUD!). After all, the MVC architecture is there. It is simply not fun for me>_<

So I always wander around looking for interesting stuff outside of the Rails world. Celluloid, Gosu and Fluentd are all fun to play with. The latest toy is mruby, a lightweight implementation of Ruby. It works for embedded systems and can be linked into existing software. Well, that sounds like Lua, but for a former Lisp fan, Ruby is more interesting, isn't it? Sorry I've gone a little off-topic. I could write another post describing my feelings on different languages, but that's not the point today:)

Last Saturday I suddenly got this idea: mruby is lightweight (around 20,000 lines of code), has a small footprint (~100k according to Matz's talk) and no threads. So why not try running it in a browser? Having had the experience of writing Web apps using GWT, I have never believed that we must use JavaScript to write code that runs in a browser. Now that I've got some time, let the fun begin:)

Building mruby

Two choices exist for compiling C/C++ code into JavaScript: Native Client and Emscripten. I used to fall in love with everything marked with Google, but things have changed. Let's first go with Emscripten and see how everything goes. The Web is a much larger world than a Chrome-only kingdom.

The current mruby compiling process works like this:

  1. Use bison to parse src/parse.y into src/y.tab.c.
  2. Compile every C source in src. The generated object files are then archived into lib/libmruby_core.a.
  3. Compile tools/mrbc/mrbc.c and link with lib/libmruby_core.a to generate the core mruby compiler bin/mrbc.
  4. Use the mruby compiler to compile the standard libraries in mrblib. The generated bytecode is attached as an array to mrblib/init_mrblib.c. The newly created source file is called mrblib/mrblib.c.
  5. Compile mrblib/mrblib.c and add it to lib/libmruby_core.a. The result is called lib/libmruby.a.

We need to compile all source code in src together with the generated files src/y.tab.c and mrblib/mrblib.c. For simplicity, and because mruby is still in an unstable state, I just require building the entire mruby first in my Makefile:

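# mruby settings
MRUBY_PATH := ./modules/mruby
# Assumed location of the mruby sources, where bison writes y.tab.c
MRUBY_SRC_DIR := $(MRUBY_PATH)/src
MRBLIB_PATH := $(MRUBY_PATH)/mrblib

MRBLIBC := $(MRBLIB_PATH)/mrblib.c
YC := $(MRUBY_SRC_DIR)/y.tab.c

# yacc compile: building mruby generates y.tab.c
$(YC) :
	@(cd $(MRUBY_PATH); make)

# mrblib.c compile: building mruby generates mrblib.c
$(MRBLIBC) :
	@(cd $(MRUBY_PATH); make)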

Now the code can be compiled into a giant JavaScript file.

C compiler vs C++ compiler

In short, it was a tough and mysterious road-_-

The first version compiled okay, but it kept running into a RangeError: Maximum call stack size exceeded error. While browsing the emscripten wiki, one sentence in CodeGuidelinesAndLimitations caught my eye.

"Nonportable code that uses low-level features of the native environment, like native stack manipulation (e.g. in conjunction with setjmp/longjmp. We support normal setjmp/longjmp, but not with stack replacing etc.)."

I suspected this might be the reason. What supported my idea was one commit in emscripted-ruby, which is a port of ruby 1.8.7 to the browser. It basically replaces all setjmp/longjmp calls with C++ exceptions. So I decided this might be the direction and started coding right away. Well, now it turns out this was a mistake, and I should've done more investigation. Anyway, you will see the mistakes I made.

Then I spent around ten hours figuring out a way to patch mruby to use C++ exceptions instead of setjmp/longjmp with all tests passing. The compiler was also changed from emcc to em++. Well, now mruby compiles and works:

$ node build/mruby.js
Ruby is awesome!
Ruby is awesome!
Ruby is awesome!
Ruby is awesome!
Ruby is awesome!

This was Wednesday night, and I went off to read the FreeBSD book for an exam. On Thursday morning, I accidentally compiled the code with em++ but without my patches for setjmp. The code happened to work! Even when I reverted the code back to the initial revision with all the original build settings, I still could not reproduce the RangeError. Well, it remains a mystery, and all I can guess is that something was wrong with my installation of Node.js that day. It might have used a different setting...

Then another problem came: when compiling with emcc, the code compiles okay, but runs forever without terminating. It only works when compiled with em++... This is also strange, and I will look into it later.

So the lesson I learned is: do more investigation before pumping out code. There may be another reason.

Future Work

Well, this is really fun. I've sent a post on emscripten-discuss. And I will spend some time on this.

Besides that, a few interesting follow-up ideas are already in my head:

  • Interfaces to C and JavaScript libraries
  • OpenGL ES 2/WebGL binding
  • Repl in the browser
  • MobiRuby for the browser(well, this may be too ambitious)

Anyway, it is really fun working on this.

Note

If you read all the way to here(I would not read such a long article myself-_-), I want to say sorry for my English. I'm only fluent in Chinese, but I just want more people to be able to read about this:)

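A New Beginning

Published on:

From sina to blogpost, then to dreamhost and a VPS: I have lost count of how many blogs this makes. A long time ago I read an article by Joel saying that technical people should write blogs. Since then I have been eager to try several times, but there was always some reason or other for the blog to fall into neglect.

I hope to use this blog to record a new beginning, although I am still not sure whether I will end up abandoning this place too at some point. Starting today, I will no longer care about boring things like GPA or how many points I get on school exams. I already proved once as an undergraduate that I can push my GPA up, and I do not need to prove it again. I want to spend my time on more interesting things:

  • Read more beautiful and interesting code
  • Write more interesting code, and try to make that code beautiful
  • Contribute to the Open Source community as much as I can
  • If possible, write down more of what I learn here

For me, writing code is a joy. I want to get that feeling back, instead of worrying about things like lab reports that need screenshots in addition to code>_< 18 months ago I was a happy software engineer; 18 months later, I would rather continue down that road than think about these boring problems.

A side rant: In theory, there is no difference between theory and practice. But, in practice, there is. I now deeply feel that this saying is a profound truth.

In short, I do not know how long I can keep walking this road, nor whether things will necessarily get better if I do. I only know:

This is a fun road~

And that is enough.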