[ANN] RMMSeg 0.1.4 Released

March 2, 2008 – 8:02 pm

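I released RMMSeg 0.1.4 today. Performance has improved slightly: ComplexAlgorithm now runs at about 20 KB/s, and SimpleAlgorithm at roughly 60 KB/s. In a branch I reimplemented part of the code in C, which eliminated a lot of String construction, but the gain was barely noticeable, which was very disappointing. On top of that, because of the Ruby Hash bug I mentioned in my previous blog post, that version only runs correctly on a patched Ruby, so the C extension is not included in this release.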

Below is the release announcement from RubyForge:

rmmseg version 0.1.4
by pluskid

http://rmmseg.rubyforge.org

== DESCRIPTION

RMMSeg is an implementation of the MMSEG Chinese word segmentation
algorithm. It is based on two variants of the maximum matching
algorithm. Two algorithms are available:

* a simple algorithm that uses only forward maximum matching.
* a complex algorithm that uses three-word chunk maximum matching and 3
additional rules to resolve ambiguities.
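
To make the first variant concrete, here is a minimal sketch of forward maximum matching. It is not RMMSeg's actual code: the toy dictionary, the `TOY_DICT` and `MAX_WORD_LEN` constants, and the `simple_segment` method are all hypothetical names made up for illustration.

```ruby
require 'set'

# Hypothetical toy dictionary; RMMSeg ships its own, much larger word list.
TOY_DICT = Set.new(%w[研究 研究生 生命 起源]).freeze
MAX_WORD_LEN = TOY_DICT.map(&:length).max

# Forward maximum matching: at each position take the longest dictionary
# word starting there, falling back to a single character if none matches.
def simple_segment(text, dict = TOY_DICT, max_len = MAX_WORD_LEN)
  chars = text.chars
  words = []
  pos = 0
  while pos < chars.length
    len = [max_len, chars.length - pos].min
    # Shrink the candidate until it is a dictionary word or one character.
    len -= 1 while len > 1 && !dict.include?(chars[pos, len].join)
    words << chars[pos, len].join
    pos += len
  end
  words
end

p simple_segment("研究生命起源")
```

With this toy dictionary the greedy match takes 研究生 first and leaves 命 stranded as a single character, which is the kind of ambiguity the complex algorithm is meant to resolve.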

For more information about the algorithm, please refer to the
following essays:

* http://technology.chtsai.org/mmseg/
* http://pluskid.lifegoo.com/?p=261
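
The chunk-based variant can be sketched in the same spirit. This is only an illustration under assumptions, not RMMSeg's implementation: the names `CHUNK_DICT`, `words_at`, `chunks_at`, `chunk_score`, and `complex_segment` are hypothetical, the dictionary is a toy, and only the length-based rules are shown (largest total length, then largest average word length, then smallest variance of word lengths). The frequency-based rule is left out because it needs character frequency data.

```ruby
require 'set'

# Hypothetical toy dictionary for illustration only.
CHUNK_DICT = Set.new(%w[研究 研究生 生命 起源]).freeze
CHUNK_MAX_LEN = 3

# Candidate words starting at pos: every dictionary match, plus the single
# character itself so that segmentation can always advance.
def words_at(chars, pos, dict)
  return [] if pos >= chars.length
  limit = [CHUNK_MAX_LEN, chars.length - pos].min
  (1..limit).map { |n| chars[pos, n].join }
            .select { |w| w.length == 1 || dict.include?(w) }
end

# Enumerate every chunk of up to three consecutive words starting at pos.
def chunks_at(chars, pos, dict, depth = 3)
  first = words_at(chars, pos, dict)
  return [[]] if depth.zero? || first.empty?
  first.flat_map do |w|
    chunks_at(chars, pos + w.length, dict, depth - 1).map { |rest| [w] + rest }
  end
end

# Rank chunks: longer total length, then larger average word length,
# then smaller variance of word lengths.
def chunk_score(chunk)
  lens = chunk.map(&:length)
  total = lens.sum
  avg = total.to_f / lens.size
  variance = lens.sum { |l| (l - avg)**2 } / lens.size
  [total, avg, -variance]
end

# Emit the first word of the best chunk, then repeat from the next position.
def complex_segment(text, dict = CHUNK_DICT)
  chars = text.chars
  words = []
  pos = 0
  while pos < chars.length
    best = chunks_at(chars, pos, dict).max_by { |c| chunk_score(c) }
    words << best.first
    pos += best.first.length
  end
  words
end

p complex_segment("研究生命起源")
```

On this input the chunks 研究 / 生命 / 起源 and 研究生 / 命 / 起源 tie on total length and average word length, and the variance rule picks the first one.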

== CHANGES

* Users can now add their own custom words to the Dictionary after it has been loaded.
* Improved performance of SimpleAlgorithm.

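5 Responses to “[ANN] RMMSeg 0.1.4 Released”

  1. If performance really matters, Ruby probably isn't much fun to use. In the benchmarks I've seen, Ruby always ranks behind n other languages.

    By xwl on Mar 2, 2008

  2. Yeah, Ruby is indeed slow, but I only found out about the efficiency problem after writing it in Ruby first. :p

    By pluskid on Mar 2, 2008

  3. There's an "aditonal" in the middle; shouldn't that be "additional"?

    By quark on Mar 3, 2008

  4. @quark:
    *^_^* Yes, it should~~ :p

    By pluskid on Mar 3, 2008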

