Archive for June, 2008

Sleep trends of the MSTC staff

Thursday, June 26th, 2008

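Starting around last winter break, I often posted “good night posts” on the MSTC board of cc98: each night before going to sleep I would post a thread saying good night. Eventually the board often filled up with a pile of good night posts, and people protested. :p So we gathered them into a single thread, and once it became a habit, everyone started dropping by to say good night.

Whether it was premeditated or done on a whim, I analyzed the contents of this good night thread and got a result like the one below: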


plot_simple.jpg

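The horizontal axis is the date and the vertical axis is the bedtime. Since everyone goes to bed fairly late, times in the early hours of the next morning are counted as part of the same night (for example, 25:00 means 1:00 the next morning). Each line represents one person’s sleep trend. Not everyone replies to the good night thread every day, so the points on each line are not evenly spaced along the horizontal axis. The plot above only covers the first few days of data; below is the result of analyzing the complete data from February 24 this year up to now (click to view the image at its original size):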
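The “25:00” convention can be sketched in a few lines of Python. This is only an illustration, not the script used for the plots: the function name `sleep_point` and the noon cutoff `CUTOFF_HOUR` are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Assumption for this sketch: a post made before noon belongs to the previous night.
CUTOFF_HOUR = 12

def sleep_point(posted_at):
    """Map a post timestamp to (night_date, hour), where hour may exceed 24."""
    hour = posted_at.hour + posted_at.minute / 60
    if posted_at.hour < CUTOFF_HOUR:
        # Early-morning post: attribute it to the previous calendar day.
        return (posted_at.date() - timedelta(days=1), hour + 24)
    return (posted_at.date(), hour)
```

With this mapping, a post at 1:00 on June 26 is plotted at 25:00 on June 25, so each person’s line stays continuous across midnight.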

Read the rest of this page »

The 8th week of Schemepy

Saturday, June 21st, 2008

Mithro is collecting first-month status reports for the various Thousand Parsec related GSoC projects. Since I started early, though, this is already the end of my 8th week working on Schemepy. As I mentioned before, I have spent most of my time preparing for final exams, and I’ll take my first exam the day after tomorrow, so not much work got done this week.

First, I added a (very) simple homepage for Schemepy. It is available here. We’ll set up a shorter URL (like schemepy.thousandparsec.net) later.

Read the rest of this page »

On the Rubinius FFI

Tuesday, June 17th, 2008

Contents:

The need for glue code

Ruby is a powerful language, but sometimes you’ll still want to interact with native functions written in C/C++. C/C++ and Ruby cannot call each other directly, so you need to add a glue layer. There are generally two ways to write the glue layer.

Read the rest of this page »

7th week of Schemepy

Saturday, June 14th, 2008

Not much got done this week. Basically, I looked a bit at tpserver-py and tried to run it on my local box. If I can run it successfully and test its features, I can go ahead and port it from pyscheme to Schemepy. In the best case, there would be regression tests that I could simply run before and after porting to make sure I didn’t break anything. Unfortunately, it seems that I need to test the functionality manually.

Read the rest of this page »

Introduction to direct threading, or computed goto

Saturday, June 14th, 2008

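Many languages today first compile source code into portable bytecode and then run the program by interpreting that bytecode (or by JIT-compiling it). This is much more efficient than walking the AST directly. In many current virtual machines the bytecode is actually no longer measured in bytes but in machine words (Rubinius and YARV, the Ruby 1.9 virtual machine, both work this way); the name “bytecode” has simply stuck. I will explain the reason for this later.

Beyond that, virtual machines generally come in two kinds: stack-based and register-based. A register-based design is the natural choice for a real CPU, but things are different when implementing a virtual machine: a stack-based instruction set makes the bytecode more compact, which also improves cache utilization, and it is simpler to implement than a register-based machine. Quite a few papers (for example, The case for virtual register machines and Virtual machine showdown: Stack versus registers) have compared performance and concluded that with a register-based virtual machine “the code size increases somewhat, but the performance gain is worth it”. In practice, however, apart from Parrot, the Perl 6 virtual machine, most other virtual machines are stack-based.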

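After all that preamble, here is today’s actual topic. A virtual machine works much like a real CPU, with roughly the following steps:

  1. Fetch the instruction. Here that means the bytecode.
  2. Decode. In a real CPU this step can be fairly complex; the x86 instruction set, for example, is remarkably complicated. A virtual machine is different: the opcode and operands can usually be separated very easily.
  3. Execute.

If you have taken a computer architecture course, you should know roughly how a simple CPU executes an instruction. In a virtual machine, there is usually a piece of code for each opcode that performs the corresponding operation.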
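The three steps above can be sketched as a tiny stack-based interpreter in Python. This is a plain dispatch loop, not direct threading (which needs a feature like computed goto in C), and the opcodes `PUSH`, `ADD`, `MUL` and `HALT` are made up for the example rather than taken from any real virtual machine.

```python
# Hypothetical opcodes for this sketch; each code element is one "word".
PUSH, ADD, MUL, HALT = range(4)

def run(code):
    """Interpret a flat list of opcodes and operands on a value stack."""
    stack = []
    pc = 0
    while True:
        op = code[pc]                # 1. fetch the next instruction word
        pc += 1
        if op == PUSH:               # 2. decode: compare against each opcode
            stack.append(code[pc])   # 3. execute: the operand is the next word
            pc += 1
        elif op == ADD:
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError("unknown opcode: %r" % (op,))
```

For example, `run([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])` evaluates (2 + 3) * 4. Every instruction goes through the same chain of comparisons at the top of the loop, and that shared dispatch point is the overhead that techniques like direct threading aim to reduce.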

Read the rest of this page »

sdcv-mode.el update: now keeps a sdcv process running

Wednesday, June 11th, 2008

sdcv-mode.el is an extension that lets you look up words in Emacs. It is a front end to sdcv, the console version of StarDict. It has rarely been updated since I wrote it in Dec 2006. Of course sdcv-mode.el isn’t perfect, but I think it has served its purpose well: it is handy precisely because it is just a front end to sdcv.

However, a user recently opened a ticket complaining that sdcv-mode.el is very slow in NT Emacs because it starts a new sdcv process every time a word is looked up. This is especially slow when you have a lot of dictionaries installed. So I changed the strategy: instead of starting a new process each time, I keep an sdcv process running in the background.

Read the rest of this page »

Using Graphviz for graph visualization

Wednesday, June 11th, 2008

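The “Compiler System Design” course has an assignment: write a parser for some language, build a syntax tree, and display that tree in a suitable way. I used Graphviz for the visualization part. It is a very convenient tool for visualizing graphs, and since a syntax tree is a tree, which is really just a directed acyclic graph, it works well for this.

The assignment really has two parts: parsing and visualization. The “some language” can be one you define yourself. At first I wanted to parse Scheme, but on second thought I gave up on that: it is so simple that I feared the TA would not let it pass. The assignment requires parsing with YACC or recursive descent. I don’t know how to use YACC yet (and have no plans to learn it for now), so I used Treetop for the parser. Treetop parses using PEG, which is actually very similar to traditional recursive descent.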
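To give an idea of what the visualization side involves, here is a minimal sketch in Python (not the Ruby code from the assignment) that turns a syntax tree into Graphviz DOT text. The tree representation, a `(label, children)` tuple, and the function name `tree_to_dot` are assumptions made for the example, and labels are assumed to contain no double quotes.

```python
from itertools import count

def tree_to_dot(tree):
    """Return DOT source for a tree given as nested (label, children) tuples."""
    ids = count()
    lines = ["digraph ast {"]

    def visit(node):
        label, children = node
        node_id = "n%d" % next(ids)          # unique id, since labels may repeat
        lines.append('  %s [label="%s"];' % (node_id, label))
        for child in children:
            child_id = visit(child)
            lines.append("  %s -> %s;" % (node_id, child_id))
        return node_id

    visit(tree)
    lines.append("}")
    return "\n".join(lines)
```

Saving the output of `tree_to_dot(("+", [("1", []), ("2", [])]))` to a file such as ast.dot and running `dot -Tpng ast.dot -o ast.png` renders the tree as an image; Graphviz takes care of the layout.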

Read the rest of this page »

[ANN] rmmseg-cpp 0.2.5 released

Sunday, June 8th, 2008

I developed rmmseg-cpp about half a month ago. After it had been running at JavaEye, the feedback was that performance is good and memory usage is stable. I’m very glad to see rmmseg-cpp being used in production.

Read the rest of this page »

6th week of Schemepy: mzscheme-3m, guile-1.6 and others

Saturday, June 7th, 2008

As I said in the last status report, although I finished the mzscheme backend, I found that the version in common use today actually uses a different memory model. My original code was written against the CGC memory model, so I need to re-implement part of the backend to adapt it to the new 3m model.

At first, I had no idea how to implement it at all. The mzscheme GC moves memory around, so I would need to register all local pointers with it, which is unrealistic for a Python program. So I asked for help. I got answers from both the schemepy mailing list and the mzscheme mailing list: use immobile.

Read the rest of this page »

I almost deleted my home directory

Friday, June 6th, 2008

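I have always hated the -i option of the rm command, and I usually delete directories directly with rm -rf. As the saying goes, “walk by the river often enough and your shoes will get wet”: today I almost deleted my home directory.

I was trying out blueprint. Its configuration file lets you specify a path, but it turns out that it does not expand ~ to the home directory. I ran it several times with no result, and it only worked once I switched to an absolute path. Then I happened to notice that it had created a directory named ~ in the current directory. Not knowing whether to laugh or cry, I immediately ran rm ~ to delete it, which failed. I did not read the error message carefully at all and thought: oh, right, this is a directory, it can’t be deleted like that. So out of habit I typed rm -rf ~ ...

After about a second nothing had happened. I was thinking that this directory should be tiny, so why was the deletion taking so long? I hit Enter once more, and then it suddenly dawned on me: could it be deleting my home directory?! I immediately pressed ^C to stop it!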

2008-06-06-rm-rf.png

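I quickly ran ls, and everything that should be there still seemed to be there, so it was a false alarm, but I really have to be careful from now on. Still, if nothing was actually deleted this time, what was it doing during those two seconds? Could it be that permission denied saved me?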
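The root cause is that ~ is expanded by the shell, not by the file system, so a program that reads a path from a configuration file has to expand it itself. A minimal Python sketch of that step follows; the helper name `resolve_config_path` is made up for the example and has nothing to do with how blueprint actually handles paths.

```python
import os

def resolve_config_path(path):
    """Expand a leading ~ the way a shell would, then make the path absolute."""
    return os.path.abspath(os.path.expanduser(path))
```

Only a leading ~ is expanded, so `resolve_config_path("./~")` still points at a directory literally named ~ in the current directory. The same trick works in the shell: `rm -r ./~` or `rm -r '~'` removes the stray directory without touching the home directory.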