July 28th, 2009
New JRuby YAML support with Yecht
A while back I finally got fed up with all our minor YAML incompatibilities. Since I’ve been in charge of JRuby’s YAML support most of the time, this is something I take personally. I’ve written several YAML processors now, and I decided it was time to make sure, once and for all, that we were totally compatible with MRI.
As it happens, the incompatibilities in JRuby’s YAML support can be divided into two categories. The first consists of things that can’t easily be done with JvYAML because they depend on Syck internals. More and more of these started cropping up, especially around customizing serialization and loading, but also in how parsing behaves.
The second category is a bit more annoying. These bugs stem from invalid YAML that MRI nevertheless emits or parses. Syck happens to be a bit loose and nice – and it’s also a YAML 1.0 processor. JvYAML started life as a YAML 1.1 processor, and it was pretty strict. During the last year I’ve crippled JvYAML, making it more 1.0 compatible and less strict to bring it closer to Syck. But at the end of the day full Syck compatibility would never be possible from within JvYAMLb.
So I started hacking on Yecht. Two weeks later, it has been merged into JRuby trunk. Yecht is a proper port of Syck that matches Syck semantics more or less to the letter – including bugs. Don’t believe me? Just try “YAML::Syck::Map.new(nil, nil, nil).kind” on MRI and “YAML::Yecht::Map.new(nil, nil, nil).kind” on JRuby and see…
As it happens, the story of how I ported Syck is quite interesting, so I will write a separate post about that, focusing on some of the more impressive performance improvements I managed to squeeze out of the parser.
But the short story is this: JRuby’s YAML support is now better than ever, and much more compatible with how MRI does things. All open YAML bugs in JRuby’s bug tracker have been closed, and all tests run as they should.