[Schemepy] More about the mzscheme backend

June 2, 2008 – 2:51 pm

I finally finished the mzscheme backend. As I said before, I should have finished it before writing the last week’s status report. However, I’m still pleased to see the result though a little late than scheduled. So I’ll write a follow up of the last status report.

With the experience of implementing the guile backend. It should be not too hard to write yet another backend. It is true for most parts. However, there are many differences between different Scheme implementations. Here are some examples:

  • The type system. How would a generic Python object represent in Scheme? They are just normal Python objects in pyscheme. In guile, you’ll register a SMOB type to handler this. And registering a new Scheme_Type is necessary in mzscheme.
  • Exception handling. First you should catch the exceptions raised in Scheme gracefully instead of crash the whole VM. Then you’ll need to map the native exceptions to various pre-defined Schemepy Exceptions with respect the the specification.

    Our goal is that the user won’t see any difference when the backend changed. So the same exception should be raised when all other situations are the same (except with a different backend). It’s not too hard to cover 80% cases. But it will become very hard to achieve more. For example, the Scheme code `(,,(+ 1 2)) will raise unbound variable error when evaluated in guile, but syntax error will be raised when compiling it in mzscheme. Different exceptions are raised from different stages. It’s really hard to uniform those behaviors.

    My current solution to this kind of problem is to adjust the API/testcases. It is not guaranteed whether syntax error or unbound variable error will be raised. Nor is it guaranteed the exception would be raised when compiling or evaluating. But at least a SchemeError (this is the ancestor of all Scheme exceptions) will be raised when compile and evaluate the code.

  • Memory management. Or more specifically, garbage collecting. This is always one of my biggest headache.

    It is described in the document that mzscheme has two GC: CGC and 3m, and

    … overall system performance is typically better with MzScheme3m … most users now run MzScheme3m.

    However, after installed mzscheme with apt-get, I didn’t find any 3m related files/libraries. I then assumed that the new model hasn’t been widely used yet (this is approved to be false, I’ll cover this later). So I decide to use the CGC, which seemed to be a little easier to use.

    However, the garbage collector is scaring me — they may move objects in memory during a collection. Pointers need to be registered to the GC so that they can be fixed when objects are moved. Again, I think it is not a problem in C, but I don’t know how things will go in Python with ctypes.

    In fact, what prevent me from finishing the backend before the last status report is the problem with GC. The program just fails silently, or raise some arbitrary strange exceptions. I created several branches to test various ways of doing exception handling, namespacing and various stuffs that seemed to be failing. Fortunately, I finally found the problem with the GC: I failed to register some global Scheme values to the GC to prevent it from collecting them. I think should learn something from this: if I had read the document on scheme_register_extension_global carefully, this kind of problem wouldn’t happen.

  • etc.

If you look at those examples above carefully, you’ll notice they all have a common property: one is dealing with two similar systems with C (which is very much different) as the bridge:

  • The type system of Scheme and Python (both duck typing).
  • The exception system of Scheme and Python. Exceptions need to be handled in C (where there’s no exceptions) and then converted to another exception system. Here C is more like a bottleneck than a glue. More over, currently only one way passing is implemented — exceptions raised in Scheme will be caught and raised again in Python. But the behavior of a Scheme code calling a Python function who then raises an exception is still not defined.
  • The GC system of Scheme and Python. Again, very very careful cooperation is needed here to deal with two GC system in C — a language where all memory are managed explicitly.

However, if everything is trivial, we’ll be left with nothing to do. :p And finally there’s the mzscheme backend. The benchmarks results is very good although I think I used a slow way of handling exceptions (I didn’t include the pyscheme backend, because it takes too long to run):

pluskid@benchmark $ python bm_load_vm.py
Time to load the VM ------------------------------------------------------------
                   user      system        real       total
guile          0.040000    0.050000    0.120000    0.210000
mzscheme       0.030000    0.050000    0.250000    0.330000
[19721 refs]
pluskid@benchmark $ python bm_tailloop.py
Tail call performance ----------------------------------------------------------
                  user      system        real       total
guile          0.006000    0.002000    0.007000    0.015000
mzscheme       0.002000    0.000000    0.002000    0.004000
[19732 refs]
pluskid@benchmark $ python bm_tak.py
tak benchmark ------------------------------------------------------------------
                  user      system        real       total
guile         11.996667    0.026667   12.110000   24.133333
mzscheme       2.933333    0.010000    2.953333    5.896667
[19733 refs]
pluskid@benchmark $ python bm_strcat.py
string-append and substring performance ----------------------------------------
                  user      system        real       total
guile          0.012000    0.026000    0.038000    0.076000
mzscheme       0.076000    0.026000    0.103000    0.205000
[19732 refs]

Besides, mzscheme support compiling. I’m planning to write some benchmarks to test the performance improvements brought by this feature. Generally, here’s my current tasks (I’ll try to keep several hard tasks simultaneously, if one of them stalled due to some difficulties, I can pick up another):

  • Benchmarks for testing the performance improvements brought by compiling the Scheme code.
  • Improve the benchmark utility by using the timeit module.
  • Port the current code of mzscheme backend from CGC memory model to 3m model. I said that I thought CGC is more common. But I found later that it’s just my version in Debian lenny is too old. The packages in Debian unstable, some reason. Mithro has provided me some detailed information after that. I would also like to pick up this task when possible.
  • Various tasks with lower priorities. There are still 7 todo items in the log_book, as well as other not listed tasks like Guile 1.6 support, Windows support, etc.

Although sometimes it is frustrating to see segfaults all over the screen, the whole process has been interesting, especially when you finally find the way to solve the problem. I’ve also learned a lot from this.

One example is the altitude to the test cases. I always admire TDD/BDD, like to write tests, and especially glad to see all the test cases passing without problems. But I’m a little sensitive about the non-passing test cases. When I see some non-passing test cases, I have really strong will to make them pass. But sometimes I will change (or even remove) the test case merely to make it pass.

E.g. the test case of (= 1) is asserted to return #t. But exception is raised complaining wrong number of arguments are provided in mzscheme. There are different behaviors in different implementations. It is almost impossible to modify the native implementation to fix it (even if your patched get accepted, it would takes too long to let the new version with your patch become widely used), so I decided this is undefined behavior and removed the test case.

But I was wrong, even though it is undefined behavior, the test case should still be there. It is generally OK to have non-passing test cases. And the goal of a test case is not to pass, but to broke your code. Because 1000 (or even more) test cases passed won’t prove your program right, but 1 test case failed is enough to tell you the problem of your program.

Another important things I’ve learned from this (and from the whole open source community) is that source code is always good document. Sometimes it will be more clear to look at the source code directly than reading a bunch of long manuals. And open source makes it possible!

Here’s the snippets I found in the mzscheme source code to check whether the parameter is a valid list:

int scheme_list_p(Scheme_Object *o)
    Scheme_Object *obj1, *obj2;
    obj1 = obj2 = o;
        if (SCHEME_NULLP(obj1))
            return 1;
        if (!SCHEME_PAIRP(obj1))
            return 0;
        obj1 = SCHEME_CDR(obj1);
        if (SCHEME_NULLP(obj1))
            return 1;
        if (!SCHEME_PAIRP(obj1))
            return 0;
        obj1 = SCHEME_CDR(obj1);
        obj2 = SCHEME_CDR(obj2);
    while (NOT_SAME_OBJ(obj1, obj2));
    return 0; /* circular list */

It detects circular list cleverly while still keeps the common case (without circular list) fast.

Post a Comment