Google Summer of Code 2008: First Chapter

April 23, 2008 – 12:01 am

Finally, I was accepted by Thousand Parsec and will work on Schemepy with Timothy Robert Ansell this summer. I hope it will be an enjoyable summer. :)

It’s been a fairly long time since the announcement of Google Summer of Code 2008, so I’ll try to write something about that period. I learned a lot from it, and it’s really (somewhat) by coincidence that I ended up working on Schemepy. :p

Announcement

It was around the summer of 2007 when Jack told me about Google Summer of Code, but the application period for GSoC 2007 had already ended by then. So when I read the announcement of GSoC 2008 in Google Reader in late February this year, I was excited. I checked the GSoC website and some of the projects accepted in 2005, 2006 and 2007.

But I didn’t do anything practical until the list of accepted mentoring organizations was published. Students then had a week to review the project ideas provided by the accepted organizations or to come up with their own. My laptop battery suddenly stopped working during those days. :( I had to go to sleep before 23:00 GMT+8 every day, but many developers are in totally different timezones, so communication was not very convenient for me. I finally managed to buy a new battery (which is more powerful than the original one, :) and since then I have rarely gone to sleep before 26:00 each day :( ).

I found many interesting project ideas on the GSoC2008 homepage:

  • git
      • Better Emacs integration for git: This is cool since I like Emacs and git, and I’m quite familiar with elisp.
      • Gittorrent client/server implementation: Gittorrent is a P2P method for distributing git repositories, which allows for faster downloads and lower startup overhead. I have long been interested in networking (especially P2P), so this sounded interesting to me. But I noticed there had been a Gittorrent project in GSoC 2007, so I asked about it on the mailing list. Shawn told me that the student accepted last year had been unable to actually participate due to legal problems, and that it was too late by the time they found out.
  • Python
      • Python Markdown, Filesystem Virtualization: Sounds interesting.
      • parabix: High performance text processing interface to Python. I found it cool when I first heard the idea of processing the 8 bits of a byte in parallel. But the documentation on the homepage is very limited, so I contacted the people involved, and they were kind enough to give me a brief introduction and point me to the u8u16 project for more details.
      • NLTK, TinyPy, CloneDigger: There seemed to be many interesting Python related ideas.
  • Git plugin for GEdit: In the GNOME project.
  • Kompare semantic diff: In the KDE project. I think it would be OK if the task were only to develop the interface, but it would not be easy to develop many plugins based on that interface. It looks like something from CloneDigger could be borrowed to write a Python semantic comparison plugin.
  • Amarok Scripting: AmaroK is my favourite media player/manager.
  • Debian netconf: Sounds cool. I looked at the project’s mailing list archive. Some students already seemed to be getting involved in this project, so I didn’t go any further.
  • Subversion: The page describing their GSoC ideas also contains some non-GSoC projects. I found the language binding project interesting and asked about it on the mailing list, but they told me it was not for GSoC. However, I found the svn community quite friendly.

There were no Ruby related project ideas! All Ruby related projects were supposed to be under the RubyCentral organization. However, the GSoC homepage said RubyCentral hadn’t put up an idea list yet, and the RubyCentral homepage didn’t even mention GSoC 2008. I asked about this on the ruby-talk mailing list, but not many people replied. I just wondered whether the Ruby people were aware of it. Though students are encouraged to submit their own ideas, I do think the organization should also do something.

Proposals

After collecting the ideas that interested me, I read their documentation, joined some mailing lists and contacted some people for details. There are some suggestions and guides that are really worth reading.

Then I started to write proposals. I had collected a bunch of project ideas and some related information. Before deciding which one to start with, I wrote a proposal draft for one of my own ideas: PyPeg. In general, it would be a PEG parser generator for Python, like Treetop for Ruby. I posted the draft to soc2008-general, the Python mailing list where GSoC ideas were discussed. Even though I found the idea cool, not many people were interested in it. I guess most of them would ask: why (yet) another parser generator? In fact, I learned that when you propose a project, the most important thing is how it will benefit the community, not how cool it is.

I also looked at other projects: I downloaded and read the Gittorrent RFC document; I downloaded and read some papers on PEGs and packrat parsers and finished my PyPeg proposal; and so on. But I didn’t have much time for GSoC. The final exams for the spring semester were coming and I was busy with the Software Engineering course project. So the week passed quickly and Google began to accept proposals.

On that day (IIRC) Mithro posted the Schemepy idea to the PSF gsoc-general mailing list. I clicked through the link to get a brief idea of the project. I didn’t notice that the link was to a Thousand Parsec page (in fact, I hadn’t heard of Thousand Parsec before that) and wondered: since Python has a much richer set of libraries, why would people want to embed Scheme in Python? So I asked these questions and started a discussion with Mithro. Eventually I understood what Thousand Parsec was and why this project was needed.

I asked about the status and goals of the Schemepy project and wrote a proposal for it. I knew Scheme and Python and found the project interesting. Though I never expected this proposal to be accepted, I did write it carefully. After I sent my proposal to Mithro, he helped me a lot in improving it.

In fact, I would have liked very much to work on something related to Ruby this summer. While there weren’t any project ideas listed, I had my own ideas:

  • Porting various libraries from Ruby 1.8 to Ruby 1.9.
  • Improving Ruby’s garbage collector.
  • Ruby memory profiler.
  • etc.

In fact, when I wrote RMMSeg, I was very disappointed by the performance and memory usage of Ruby, so I was eager to improve them. In the end I picked the memory profiler idea and wrote the proposal. The main goal would be to improve BleakHouse and ruby-prof.

Then, finally, RubyCentral posted their suggested project ideas. The introductions were very short (many were just one line), and only one project description contained a link to further information. Very few mentioned who had proposed the idea or whom to contact for more information. I skimmed the page and found Merbtastic, Merblets and mod_rubinius interesting. But the information was very limited, e.g. the full description for mod_rubinius was:

Assist in development of mod_rubinius.

And the Merbstatic description said “If you need more info, Wayne or myself can answer questions.” But there was no link and no email address. Who is “myself”?

I googled Wayne’s email address and sent him (and the merb mailing list) an email asking for information about Merbstatic and Merblets. No reply.

I found nothing related to mod_rubinius on Rubinius’ homepage or issue tracker. But I had heard some time earlier that rue would be the official developer of mod_rubinius, so I asked him about the status and goals. He was kind enough to explain what mod_rubinius would be (I wrote the proposal based on his description). But it seemed the project hadn’t started yet, the goal of mod_rubinius as a GSoC project was not clear, and he said he wouldn’t be the mentor even if it was accepted as a GSoC project. I wondered whether he was even aware that mod_rubinius had been posted to the project idea list. I wouldn’t have submitted the proposal had I known that earlier.

Then RubyCentral added more ideas to their list, where I found the ruby debugger project. It was the project with the most detailed description. Though I still could not find out who had proposed the idea or whom to contact, I managed to get enough information from the description and links to write my ruby debugger proposal.

Those were all my proposals:

  • PyPeg: My own idea; no one seemed interested.
  • Schemepy: I submitted it to both PSF and Thousand Parsec. Mithro said it was a good proposal, but the project had a lower priority. They would accept me if they got enough slots from Google.
  • Ruby memory profiler: My own idea; again, no one expressed any interest.
  • mod_rubinius: …
  • Ruby debugger: Rocky contacted me, and I thought this proposal was the one most likely to be accepted.

Initial work

Google extended the proposal submission period by a week, but I didn’t write any more proposals after the ruby debugger one. One reason was that writing a proposal takes a lot of work. The other (and main) reason was that I had started to work on the debugger proposal.

In fact, Rocky (who initially proposed the debugger idea) contacted me and helped me improve my proposal a great deal (I rewrote it completely). Then he asked some very specific questions (like an interview), such as whether I was familiar with data structures and whether I was comfortable programming in C. I think we were both satisfied with the conversation. Rocky gave me commit access to his repo and I created a branch for GSoC2008.

I very much enjoyed working with Rocky. I think he is a very good mentor. After making sure I understood enough to make a start, he helped me break the tough task into small pieces, so I got started very easily:

  • Get an overview of YARV instruction set and write a report.
  • Write an extension to access and modify the instruction sequence of YARV.
  • Make the instruction set related information available from Ruby, also a publicly accessible instruction table.
  • Investigate YARV code about the dangling pointer problem if we re-allocate the instruction sequence block.
  • Write a program to patch the instruction sequence to get trace hook called.
  • etc.

I used to think I would end up working with Rocky on the debugger this summer. Then Evan released a new version of BleakHouse with many great improvements. In fact, all the things I had mentioned in my proposal were fixed. I blogged about it, joking that my GSoC memory profiler proposal had been invalidated by the release of BleakHouse v4. Evan read my blog and left a comment saying that this was not true: there were still many things to be done, and he would be happy to work with me. I thanked him. I thought that if he had told me earlier (and so if I had written the blog post earlier, and if he had released BleakHouse v4 earlier :p ), I would have been very happy to work on that. But I had been working happily with Rocky.

Ranking

The extended deadline finally passed on April 7. I was still working on the debugger. ruby-debug 0.10.1 was released. The Ruby 1.8.7 preview was released. And the ranking of students’ proposals started. Unsurprisingly, there were still students asking how to apply to the program. Plenty of them, I guess, never read the FAQ at any point. Many of the posts on the GSoC mailing list were of that kind, which was very annoying.

Then Rocky told me that both of my proposals on the Ruby side had mentors assigned, and that if I had a preference I should let them know as early as possible. So I posted my preference (I chose the debugger project) in the comments. At least Evan noticed my comments.

Several days later (I was taking my final examinations), Rocky told me that the other mentors favored the memory profiler project. While the memory profiler project was top ranked, the debugger project was placed just below the acceptance line. I was quite surprised at first. I had asked on the GSoC mailing list how the conflict would be resolved if two or more of a student’s proposals were likely to be accepted, and they said the student’s preference would be taken into consideration. But I learned here again that the community’s benefit is more important than your own preference.

I felt disappointed, but I think I kept a good attitude toward the project. I told Rocky that even if it was not accepted as a GSoC project, I would still be glad to work on it (though of course not full time). In fact, both of us had been taking the project seriously because Google and RubyCentral would expect us (or me) to deliver some final results if it were accepted as a GSoC project. But if it became a personal hacking project, we could relax and consider more cool ideas.

For example, the YARV trace instruction is 2 words long, which makes it very hard to put it in place of some shorter instructions. We discussed many workarounds, but all of them had defects and were hard to implement. The best solution would be to patch the VM. But I don’t think YARV wants to be patched by various projects, especially while it is actively evolving. If this were only personal hacking, though, I wouldn’t care that much. :D
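To make the size problem concrete, here is a toy sketch in Python. It is only an illustration under assumptions: the flat word list, the opcode names and their widths are hypothetical stand-ins, not the real YARV encoding, and `decode` and `patch_in_place` are made-up helpers. It shows what goes wrong when a 2-word instruction is written over a 1-word one without moving anything else.

```python
# Toy model only: an instruction sequence is a flat list of "words", where each
# instruction is an opcode followed by its operands. The opcode names and
# widths below are hypothetical, not the real YARV encoding.
WIDTH = {"nop": 1, "leave": 1, "trace": 2}  # words per instruction


def decode(words):
    """Split a flat word list into (opcode, operands) tuples."""
    out, i = [], 0
    while i < len(words):
        op = words[i]
        n = WIDTH[op]
        out.append((op, tuple(words[i + 1:i + n])))
        i += n
    return out


def patch_in_place(words, index, new_instr):
    """Overwrite words starting at `index` without shifting later words."""
    patched = list(words)
    patched[index:index + len(new_instr)] = new_instr
    return patched


code = ["nop", "leave"]
# Writing a 2-word "trace" over the 1-word "nop" also overwrites "leave".
patched = patch_in_place(code, 0, ["trace", 1])

print(decode(code))
print(decode(patched))
```

In this toy model the patched sequence no longer contains the `leave` instruction at all, because the second word of `trace` landed on top of it. Growing the sequence instead would shift every later instruction, which is where the re-allocation and dangling pointer concerns come in.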

So I continued to work on the debugger project, though there seemed to be no doubt that it wouldn’t be a GSoC project. Instead, I was supposed to work on the memory profiler project (the top-ranked one).

On April 19, I received an email from Pratik, addressed to Pat, Evan and me, saying that all my proposals on the RubyCentral side had been rated down to zero. A friend of a friend of mine (Xue Yong Zhi, a Chinese mentor on the Ruby side) also told me about this problem. There seemed to be a lack of communication between students and would-be mentors, as well as among the mentors themselves.

In fact, there had been an IRC meeting for conflict resolution on April 18, but the Ruby people didn’t show up. So my proposal at Thousand Parsec won and my other proposals were automatically rated down to zero.

I contacted Mithro, Pat (the RubyCentral admin for GSoC) and Evan, and told them I preferred the memory profiler to Schemepy. Mithro said that would be OK. Pat said he would also contact Mithro and that the conflict could be resolved.

I didn’t know what happened after that. But when Mithro talked with me on GTalk on April 21, it seemed the situation hadn’t changed. He told me that if he rated me down at Thousand Parsec but the people on the RubyCentral side didn’t rate me up before the deadline (only about 5 hours away), I would end up being accepted by no organization. That would be really bad. But I couldn’t reach Pat at that time. All I could do was send an email to Pat and Evan. They would need to coordinate closely with Mithro to get things done.

Then I went to class. When I returned, there was still no reply to my email. The deadline on the Google timeline was April 21, ~12 noon PDT / 19:00 UTC. I was not quite sure what time that would be in my area, but guessing from the various posts on the GSoC mailing list, it would be around 03:00 at night. Many students started posting to the GSoC mailing list about how they were feeling. Several Chinese students even started to talk about Beijing’s weather on the public mailing list, which had thousands of subscribers.
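The time conversion can be checked with a short Python sketch. It assumes GMT+8 and PDT (UTC-7) are fixed offsets, and the variable names are only illustrative.

```python
from datetime import datetime, timedelta, timezone

# Deadline as given on the Google timeline: April 21, 2008, 19:00 UTC.
deadline_utc = datetime(2008, 4, 21, 19, 0, tzinfo=timezone.utc)

# Fixed offsets (assumption: no daylight saving change needs handling here).
pdt = timezone(timedelta(hours=-7))        # Pacific Daylight Time
gmt_plus_8 = timezone(timedelta(hours=8))  # the author's local time

deadline_pdt = deadline_utc.astimezone(pdt)
deadline_local = deadline_utc.astimezone(gmt_plus_8)

print(deadline_pdt.isoformat())    # 2008-04-21T12:00:00-07:00
print(deadline_local.isoformat())  # 2008-04-22T03:00:00+08:00
```

So 19:00 UTC on April 21 is 12 noon PDT and 03:00 on April 22 in GMT+8, which matches the guess from the mailing list.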

I didn’t want to wait up for the announcement. Even if I ended up working on the Schemepy project, it would still be a nice summer. I had classes the next day, so I went to sleep.

Acceptance

The next day when I got up and opened my GMail inbox, I found a bunch of unread messages. Most of them were from/to the GSoC mailing list. It was hard to find the announcement message among them, so I first checked the email I had sent to Pat and Evan the previous day. Evan had replied. I don’t know whether it was too late or he had no permission (he isn’t the administrator and his mentorship status for my proposal had been cleared) to rate me up. But the final result was what I mentioned at the very beginning of this post: I’ll work on the Schemepy project this summer. I thanked all the mentors who had been working to solve my problems and went to class.

All the accepted students are very excited. I’m excited too, which is why I’m writing this blog post. The private GSoC group for accepted students and mentors was soon flooded with hundreds of messages saying Hi, Hello or introducing themselves.

So this is the end of the 1st chapter of my GSoC2008. I learned a lot from this chapter and got to know a lot of nice people. I’m also feeling a little disappointed (from the very beginning to the end) about the organization of this year’s GSoC in the Ruby community. However, I’d still like to thank everyone for their efforts.

I’ve sent Mithro my public ssh key. I’d like to get started within a few days (I just started a new semester this week, so the first few days might be a little busy). Then I’ll go on to the 2nd chapter. :)

  1. 7 Responses to “Google Summer of Code 2008: First Chapter”

  2. There definitely were a lot of communication problems on the side of Google and the RubyCentral administration/mentors. I tried to contact Mithro yesterday but by the time I checked my email he was asleep (since he’s in Australia).

    I hope you like your Python project; sorry for all the drama.

    By Evan on Apr 23, 2008

  3. Nice story about your experiences with the GSoC application process. It’s hard every year. Not sure how many slots you will get, different timezones, conflicts… I don’t even have any idea how to make it all more smooth next time.

    Anyways, I’m glad you have made it in. We are looking forward to working with you on Thousand parsec. I understand it is not at the top of your preference list, but I hope you will enjoy working on Schemepy.

    By Jure Repinc on Apr 23, 2008

  4. @Evan,
    That’s alright. But I think we (the Ruby community, at least those people who are behind the GSoC event) should also learn something from the whole process.

    I totally agree with what Jure said — not sure how many slots you will get, different timezones, conflicts — always hard to deal with. But we can definitely do better when we devote more time/energy.

    I also hope the Ruby community has a nice summer for GSoC, and that it can do better next year. :)

    By pluskid on Apr 23, 2008

  5. @Jure,
    Thank you for those kind words. I’m also looking forward to working with the whole TP community. I’ll enjoy my summer. I hope you enjoy yours, too. :)

    By pluskid on Apr 23, 2008

  6. There are 9 projects under Ruby Central this time.
    http://code.google.com/soc/2008/ruby/about.html

    By spritesun on Apr 23, 2008
