Destructuring extravaganza


A few months back I added support for destructuring assignment and tuples to Ioke. Since Ioke’s assignment is just a regular method call, this was actually fairly easy to do. The end result is that you can do things like (x, y) = (13, 14). You can also do more interesting things, such as ((x, y), (x2, y2)) = [[1,2],[3,4]]. Notice that the right hand side is not a tuple anymore, but a list. Anything that can be turned into a tuple using the asTuple method can be on the right hand side, or an item in a recursive destructuring.
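For readers without Ioke at hand, the shape of these assignments can be sketched in Python. This is only an analogy, not Ioke code: Python unpacks any iterable, whereas Ioke goes through asTuple, and the `Point` class below is a hypothetical stand-in.

```python
# Python analogy for destructuring assignment (not Ioke code).
x, y = (13, 14)

# Nested patterns work too, and the right-hand side may be a list.
(x1, y1), (x2, y2) = [[1, 2], [3, 4]]


# Hypothetical Point: __iter__ plays the role that asTuple plays in Ioke.
class Point:
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

    def __iter__(self):
        return iter((self.x, self.y, self.z))


# Any object that can be turned into a sequence of values can be picked apart.
px, py, pz = Point(42, 14, -44)
```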

All this functionality makes code slightly more readable. But last week I decided to add support for eachCons and eachSlice, and suddenly I realized that destructuring would be very nice to have not only in the explicit assignment case, but also in cases where you want to pick apart the arguments to an enumerable or sequence method. So I added that, which means that lots of code suddenly becomes much simpler.

Long story short: in all Sequence and Enumerable methods, at every place where you could put an argument name, you can now put a destructuring pattern instead. Let’s take a look at an example:

Point = Origin with(asTuple: method((x, y, z)))

points = [
  Point with(x: 42, y: 14, z: -44),
  Point with(x: 20, y: 0, z: 444),
  Point with(x: 31, y: 646, z: 3),
  Point with(x: 456, y: 14, z: 12)
  ]

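distances1 = points consed map(obj,
  (((obj[0] x) - (obj[1] x)) * ((obj[0] x) - (obj[1] x)) +
    ((obj[0] y) - (obj[1] y)) * ((obj[0] y) - (obj[1] y)) +
    ((obj[0] z) - (obj[1] z)) * ((obj[0] z) - (obj[1] z))) sqrt)

distances2 = points consed map(
  ((x1,y1,z1), (x2,y2,z2)),
  ((x1 - x2)*(x1 - x2) + (y1 - y2)*(y1 - y2) + (z1 - z2)*(z1 - z2)) sqrt)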

distances1 inspect println
distances2 inspect println

This code first creates a Point that can be coerced into a tuple of x, y and z coordinates. We then create a list of Points with different coordinates and calculate the three distances between consecutive points. We do this in two ways, first the old way and then using destructuring. The method consed is a sequence version of eachCons. The default cons length is 2, so this will yield three entries with two points in each. We then call map on the sequence. Each entry handed to map is a List of two elements, where each element is a point. Finally we use Pythagoras to calculate the distance.

The second version is very similar - the only difference is that instead of using the square brackets to index into the lists, we instead give a pattern. This pattern contains two patterns, and the variable names inside of it will be bound to the right parts of each point.

At least in my mind, the destructured syntax is much more readable than the original one. And remember, this works for anything that can be turned into a tuple, which means you can use it on any Enumerable, on a Pair (such as what a Dict will yield) or on anything you would want to add asTuple to on your own.
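The same contrast between indexing and destructuring can be sketched in Python. This is an analogy under stated assumptions, not Ioke: `zip(points, points[1:])` stands in for consed with the default length of 2, plain tuples stand in for Points, and the coordinates are made up so the distances come out as round numbers.

```python
import math

# Illustrative points (not the ones from the post), stored as plain tuples.
points = [(0, 0, 0), (3, 4, 0), (3, 4, 12)]

# Sliding window of length 2, the analogue of "consed".
pairs = list(zip(points, points[1:]))

# Old style: index into each pair and each point.
distances1 = [
    math.sqrt(sum((pair[0][i] - pair[1][i]) ** 2 for i in range(3)))
    for pair in pairs
]

# Destructured style: the pattern names every coordinate up front.
distances2 = [
    math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)
    for (x1, y1, z1), (x2, y2, z2) in pairs
]
```

Both lists hold one Euclidean distance per consecutive pair of points, so three points give two distances.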



Ioke P released


I am very happy to announce that Ioke P has finally been released!

Ioke is a language that is designed to be as expressive as possible. It is a dynamic language targeted at the Java Virtual Machine. There also exists a version for the CLR. It’s been designed from scratch to be a highly flexible general purpose language. It is a prototype-based programming language that is inspired by Io, Smalltalk, Lisp and Ruby.

Homepage: http://ioke.org
Download: http://ioke.org/download.html
Programming guide: http://ioke.org/wiki/index.php/Guide
Wiki: http://ioke.org/wiki

The two specific releases that encompass Ioke P are ikj 0.4.0 and ikc 0.4.0.

Ioke P is the fourth release of Ioke. It includes many new features compared to Ioke E:

  • Number Infinity
  • eval
  • Reflector
  • Hooks
  • First class Runtime
  • New parser
  • Tuples
  • Structs
  • Destructuring assignment
  • Message rewriting
  • Functional composition
  • Sequences
  • Dictionary and Set versions of Enumerable methods
  • Enumerable group, Enumerable groupBy
  • Set operations for union, intersection, membership, subset and superset testing
  • ISpec stubbing and mocking
  • IIk history
  • DokGen on separate projects

Ioke P also includes a large number of bug fixes.

Features:

  • Expressiveness first
  • Strong, dynamic typing
  • Prototype based object orientation
  • Homoiconic language
  • Simple syntax
  • Powerful macro facilities
  • Condition system
  • Aspects
  • Java integration
  • Developed using TDD
  • Documentation system that combines documentation with specs
  • Runs on both the JVM and the CLR

The many things added in Ioke P could not have been done without the support of all the Ioke contributors. Thank you!

Regards
Ola Bini    - ola.bini@gmail.com



Should languages be multi-lingual?


I’m currently sitting in the Beijing ThoughtWorks office, and for some reason language is on my mind… =)

One of the discussions related to DDD that have turned up several times the last few months at conferences is how you handle ubiquitous language when your domain is not in English. Since most programming languages are based on English, you end up mixing English and Swedish for example, if you are working with a Swedish domain. Of course, the benefits of working with these concepts in Swedish are very hard to argue against. But the dichotomy between the programming language and the domain language is definitely something that hurts my eyes, so I’m generally not very fond of that approach.

In fact, I haven’t heard anyone come up with a good solution to this problem, and this post is not really a solution either.

One of the things I’ve proposed to make this situation better is to create an external DSL that is fully in the domain language. The implementation of that DSL can then be written in English. The main benefit is that there is a clear separation between the domain language and the programming language. On the other hand, the overhead of creating the DSL and the complexities involved in translating the domain concepts into programming language concepts can become problematic too.

One interesting idea in Cucumber is the idea that you can easily add new natural languages to write the features in. When it comes to user stories at the level of testing that Cucumber provides, it’s really important to use the right language. So it got me thinking, could you use the same kind of approach in a general programming language too?

As an experiment I took a small example program for Ioke, and translated it into Mandarin, with simplified Chinese characters. Of course I used Google Translate for this, so the translation is probably not very good, but the end result is still interesting. I’m not going to try to get this into my blog, so take a look at the file on GitHub instead: http://github.com/olabini/ioke/blob/master/examples/chinese/account.ik. As you can see there is nothing in there that even reeks of English. If you don’t understand Chinese characters it is probably hard to see what’s happening here. Basically an Account object is created, with a “transfer” method and a “print” method. Further down, two instances of this Account object are created, some transfers are made, and then the objects are printed. But provided my translation is not too crappy, this code should make sense to someone reading Chinese.

Now, this is actually extremely simple to implement in Ioke, since it relies on several of the features Ioke handles very easily. That everything is a message really helps, and having everything be first class means I can alias methods and things like that without any worry. Obviously your language also needs to handle non-ASCII identifiers correctly, but that should be standard in this day and age.
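The aliasing trick is not specific to Ioke. As a rough sketch, here is the same idea in Python, which also accepts non-ASCII identifiers and treats classes and functions as first class values. Everything here is hypothetical: the `Account` class is a toy, and the Swedish names (`Konto`, `överför`, `saldo`, `beskriv`) are picked only to echo the Swedish domain mentioned earlier.

```python
# English implementation, written once.
class Account:
    def __init__(self, balance):
        self.balance = balance

    def transfer(self, amount, to):
        self.balance -= amount
        to.balance += amount

    def describe(self):
        return f"<Account balance: {self.balance}>"


# Swedish surface: plain aliases, no second implementation.
Konto = Account
Konto.överför = Account.transfer
Konto.beskriv = Account.describe
Konto.saldo = property(lambda self: self.balance)

# Domain code can now be written without any English identifiers.
lön = Konto(100)
sparande = Konto(0)
lön.överför(30, sparande)
```

Python keywords such as `class` and `def` stay in English, which is exactly the limitation that a language without keywords avoids.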

When thinking about it, something similar can be created in languages like Lisp, Smalltalk, Factor, Io and Haskell - but most other languages would struggle. If you have keywords in your language, it’s really a killer - you would need to branch your parser to make it happen.

Of course, this approach only works when you can simply translate from one word to another. If the writing system is right to left, or top to bottom, it’s much more tricky to create a good translation.

I’m also not sure if this is actually a really good idea or not. It might be. The other thing I’ve been thinking about is how to handle multilingual editing. What if you want to be able to switch back and forth between languages? How can you handle identifiers with more than one name? Would you want to?

Lots of unanswered questions here. But it’s still fun to think about. Communication is the main goal, as usual.



Ioke sequence support


The last two weeks I’ve been working on adding external iterators to Ioke. This work is now done and merged, so I thought I’d just describe it a bit.

But first, why do I need explicit iterators in Ioke? Ruby has gotten by without them for a long time, only implementing a Generator library using continuations, in the standard library. It’s pretty nice, since you don’t really need to do anything explicit to get external iterators from internal ones. Of course, the problem is that it’s very inefficient to implement them like this. So I decided that Ioke should have an explicit protocol for external iterators. You can implement internal iterators using external ones efficiently, but not the other way around.

The two major objects for this in Ioke are called Sequence and Mixins Sequenced. Sequenced is the mixin that gives you access to several helper methods if you implement the “seq” method. If you implement “seq” and mix in Sequenced you will also get an “each” method and Enumerable. The “seq” method is expected to return something that mimics Sequence and has one “next” method, and one “next?” method. That’s all. The “next?” method returns true if there is another element in the sequence, and “next” returns the next one. The protocol is undefined if you call “next” when “next?” would have returned false.

Sequenced gives you an “each” method that in addition to the regular each-protocol will also return the result of calling “seq” if you don’t give any arguments to “each”.
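The protocol is small enough to sketch in a few lines of Python. This is a model of the idea only, not Ioke’s implementation: `has_next` stands in for “next?”, and the class names `CountdownSeq` and `Countdown` are invented for the example.

```python
class CountdownSeq:
    """External iterator: has_next() mirrors "next?", next() mirrors "next"."""

    def __init__(self, start):
        self.current = start

    def has_next(self):
        return self.current > 0

    def next(self):
        value = self.current
        self.current -= 1
        return value


class Sequenced:
    """Mixin: anything that implements seq() gets each() built on top of it."""

    def each(self, block=None):
        if block is None:
            # No arguments: hand back the external iterator itself.
            return self.seq()
        sequence = self.seq()
        while sequence.has_next():
            block(sequence.next())
        return self


class Countdown(Sequenced):
    def __init__(self, start):
        self.start = start

    def seq(self):
        return CountdownSeq(self.start)


# Internal iteration, driven by the external iterator.
seen = []
Countdown(3).each(seen.append)
```

The internal iterator falls out of the external one with a simple loop, which is the cheap direction; going the other way is what needs continuations or similar machinery.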

Apart from that, you will get several methods that just call “seq” and call the same method on the result. These methods are: “mapped”, “collected”, “filtered”, “selected”, “grepped”, “zipped”, “dropped”, “droppedWhile” and “rejected”. These methods also exist on Sequence. They return new sequences that implement the same behavior as the methods with similar names on Enumerable.

Finally, Sequence also mimics Mixins Enumerable. Once you call one of the Enumerable methods, the whole sequence will be realized, or as much as is necessary to give an answer. A small example of how you could use it:

(1..100000000) mapped(x, x*x) filtered(x, x % 3 == 0) takeWhile( < 10000 )

This example creates a range from 1 to 100,000,000 and finds all the squares that are less than 10,000 and that are evenly divisible by 3.
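To see why laziness matters here, the same pipeline can be sketched with Python generators. This is an analogy, not Ioke: generator expressions play the role of mapped and filtered, and `itertools.takewhile` plays the role of takeWhile.

```python
from itertools import takewhile

# Nothing is computed yet: each stage pulls from the previous one on demand.
squares = (x * x for x in range(1, 100_000_001))
divisible_by_three = (s for s in squares if s % 3 == 0)

# Realizing the result stops as soon as a square reaches 10,000.
result = list(takewhile(lambda s: s < 10_000, divisible_by_three))
```

Only the first hundred or so numbers in the range are ever squared, because the pipeline stops at the first square divisible by 3 that is not below 10,000.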



A new parser for Ioke


Last week I finally bit the bullet and rewrote the Ioke parser. I’m pretty happy about the end result actually, but it does involve moving away from Antlr as a parser generator. In fact, the new parser is handwritten - and as such goes against my general opinion to generate everything possible. I would like to quickly take a look at the reasons for doing this and also what the new parser will give Ioke.

For reference, the way the parser used to work was that the Antlr generated lexer and parser gave the Ioke runtime an Antlr Tree structure. This tree structure was then walked and transformed into chained Messages, which is the AST that Ioke uses internally. Several other things were also done at this stage, including separating message chains on comma-borders. Most significantly the processing to put together interpolated strings and regular expressions happened at this stage. Sadly, the code to handle all that was complex, ugly, slow and frail. After this stage, operator shuffling happened. That part is still the same.

There were several problems I wanted to solve, but the main one was the ugliness of the algorithm. It wasn’t clear from the parser how an interpolated expression mapped into the AST, and the generated code added several complications that frankly weren’t necessary.

Ioke is a language with an extremely simple base syntax. It is only slightly more complicated than the typical Lisp parser, and there are almost no parser-level productions needed. So the new parser does away with the lexer/parser distinction and does everything in one pass. There is no need for lookahead at the token level, so this turns out to be a clear win. The code is actually much simpler now, and the Message AST is created inline in the new parser. When it comes to interpolation, instead of the semantic predicates and global stacks I had to use in the Antlr parser, I just do the obvious recursive interpolation. The code is simple to understand and quite efficient too.
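To give a feel for what “the obvious recursive interpolation” can look like in a single-pass, handwritten parser, here is a minimal Python sketch. It is not the Ioke parser: the function names and the tuple-based node format are invented for illustration, the `#{...}` marker is assumed, and escapes and error reporting are left out entirely.

```python
def parse_text(src, i=0):
    """Parse a string literal whose opening quote is at src[i].

    Returns (node, next_index), where node is ("str", parts) and each part
    is either literal text or a ("code", parts) node.
    """
    assert src[i] == '"'
    i += 1
    parts, buf = [], []
    while src[i] != '"':
        if src.startswith("#{", i):
            if buf:
                parts.append("".join(buf))
                buf = []
            # Interpolation: recurse into code, then continue with text.
            node, i = parse_code(src, i + 2)
            parts.append(node)
        else:
            buf.append(src[i])
            i += 1
    if buf:
        parts.append("".join(buf))
    return ("str", parts), i + 1


def parse_code(src, i):
    """Parse code up to the closing brace; nested string literals recurse."""
    parts, buf = [], []
    while src[i] != "}":
        if src[i] == '"':
            if buf:
                parts.append("".join(buf))
                buf = []
            node, i = parse_text(src, i)
            parts.append(node)
        else:
            buf.append(src[i])
            i += 1
    if buf:
        parts.append("".join(buf))
    return ("code", parts), i + 1


tree, end = parse_text('"a #{b "c #{d}"} e"')
```

The two functions call each other, so the call stack tracks the nesting depth and no global stack or lexer mode is needed. Unterminated input simply raises an IndexError in this sketch.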

At the end of the day, I did expect to see some performance improvements too. They turned out to be substantial. Parsing is about 2.5 times faster, and startup speed has improved by about 30%. The distribution size will be substantially smaller since I don’t need to ship the Antlr runtime libraries. And building the project is also much faster.

But the real gain is actually in maintainability of the code. It will be much easier for me to extend the parser now. I can do nice things to make the syntax more open ended and more powerful in ways that would be very inconvenient in Antlr. The error messages are much better since I have control over all the error states. In fact, there are only 13 distinct error messages in the new parser, and they are all very clear on what has gone wrong - I never did the work in the old parser to support that, but I get that almost for free in the new one.

Another thing I’ve been considering is to add reader macros to Ioke - and that would also have been quite painful with the Antlr parser generator. So all in all I’m very happy about the new parser, and I think it will definitely make it easier for the project going forward.

This blog post is in no way saying that Antlr is bad in any way. I like Antlr a lot - it’s a great tool. But it just wasn’t the right tool for Ioke’s syntax.



Continuous Integration for Ioke with Cruise


I’ve felt the need for this since I put out the CLR version of Ioke, and now I’ve finally managed to make it happen. Even though I’m the only person with commit rights to Ioke so far, it is still good to have continuous integration running, especially since there are at least seven different builds I want to test, three on Linux and four on Windows.

I now have two servers running this. They are not public right now - I will post something when the dashboard is up - but the CI server will send notification emails to the ioke-language Google Group with status.

The current setup tests Java 1.5 and Java 1.6 on Linux and Windows. It tests Mono on Linux and Windows, and .NET on Windows.

As a CI server I’m using Cruise, ThoughtWorks’ own Continuous Integration server. Cruise is a commercial product, but open source projects can use it for free. I’ve been very happy with it on earlier projects, which is why I decided to use it for Ioke.

ThoughtWorks also gave me two virtual machines to run this CI server - which I’m very grateful for.



Videos from the Chicago ACM Ioke talk


This Wednesday I gave a talk about Ioke at the Chicago ACM. This was actually great fun and I’m fairly happy with the presentation. This is without doubt the best quality Ioke presentation available so far.

You can see it here: http://blip.tv/file/2229441

And here: http://blip.tv/file/2229292



Google I/O


Currently sitting in a session on day two of the Google I/O conference. The morning opened up with the keynote and announcement of Google Wave, which is something that seems very cool and has a lot of potential. Very cool start of the day.

After that I watched Ben and Dion talk about Bespin. I hadn’t seen Bespin before - it was definitely interesting, although I will be hard pressed to give up Emacs any day soon.

During lunch I came up with a fun idea, but it required something extra. I talked to Jon Tirsen, a Swedish friend from his ThoughtWorks days, who is on the Google Wave team - and he managed to get me an early access account for Google Wave. So I spent the next few hours hacking - and was able to unveil an Ioke Wave Robot during my talk. It is basically only a hello world thing, but it is almost certainly the first third-party Google Wave code… You can find it at http://github.com/olabini/iokebot. It is deployed as iokebot@appspot.com so when you have your Wave account you can add it to any waves. Very cool. I do believe there is a real potential for scripting languages to handle these tasks. Since most of it is about gluing services together, dynamic languages should be perfectly suited for it.

Finally I did my talk about JRuby and Ioke - that went quite well too. The video should be up on Google sooner or later.

And that was basically my Google I/O experience. Very nice conference and lots of interesting people.



Communication over Implementation


Last week I wrote a post about some of the statements that percolate in my mind when designing Ioke. What I didn’t mention was that these ideas are things I use to judge other programming languages too. I would say that this philosophy pretty much captures my views on programming languages. (The post in question is here: The Ioke Philosophy). That post is of course not complete, and I don’t think I would ever be able to write something that is totally complete.

One of the things missing - and I did allude to it in the post - was a statement that has grown on me a bit. I did a presentation about Ioke last week, and at that point I decided I needed to talk about this some more. The statement in question is what I call Communication over Implementation. This turns out to be pretty important for programming languages in general, at least in my experience.

One of the things I’m fond of saying when talking about programming languages, is that programming languages - just like natural languages - are about communication. And we don’t necessarily always think clearly about who we are communicating with. The immediate and intuitive reaction to programming languages is that they are supposed to communicate with the compiler/interpreter/cpu. That is of course true, but it is also incidental in many cases. There are many ways in which you can communicate with the machine to get it to achieve something. So the question becomes what other parties should you consider when communicating.

The next most obvious party would be yourself. If you ever need to read your code, you need to write code in such a way that you can read it later on. This constrains the way you write code quite severely. There are reasons we don’t write much code in assembly language or JVM bytecodes anymore. Yes, at some level these descriptions are extremely nice communication towards the executing machine, but they are so bad at communicating with human stakeholders that the balance generally ends up in favor of more readable languages.

When communicating with human stakeholders the thing I focus most on is intent. If your code/text communicates the intent of what you are trying to do in a good way, this makes it easier to read. There are many movements that focus on how to do this well, where domain driven design and clean code are the two that immediately come to mind for me.

So coming back to the title. For me, implementation is a special sort of communication - the kind of communication that is supposed to be functional and describe what should actually be done in deterministic instructions to a machine. As long as this communication is functional enough - meaning that the machine does more or less the right thing - there is much leeway in how the code can be written to make other kinds of communication easier. And that is the core of this argument. A language should make it easy to communicate with other stakeholders than the machine, since those other forms of communication with code are actually much more important than only the implementation pieces. Yes, if the implementation works your program might run for a while - but if no one can read the code it can’t be maintained, it can’t be understood except in a black box way, and the utility of the system will be limited.

Go the other way. If you have a program that communicates badly with the machine (but still well enough according to the above definition), but it is written in a clearly communicating way, this means that it is easier to grow the system, and easier to fix it or implement it more correctly. It can also be replaced more easily, since the program communicates what it is doing.

There are exceptions to this principle. But we seem to favor languages that are focused on implementation and only incidentally on communication. This is the wrong choice and it needs to be fixed. Communication is at the core of programming, and should also be the focus of it.



The Ioke philosophy


I have in various circumstances used a list of statements, a kind of Ioke manifesto, that tries to give the spirit of what kind of guidelines I use when designing Ioke. Some of them are very serious, and some … well, more in jest. I thought I’d expand on them a bit here. In the tradition of these kinds of manifestos, Ioke values the thing on the right, but values the thing on the left much more.

Oh, and remember that these ideas… They are really my ideas and thoughts and values. Nothing else. It is not in any way objective. So please don’t take offense, get riled up or start any holy wars.

Expressiveness over performance

This is really the full manifesto of Ioke, if you ever had to choose just one. In any situation where I have to choose between expressiveness or performance, expressiveness is always the answer. The side effect is of course that Ioke is not a fast language at all. But I believe it is one of the most expressive you can find - at least if you measure expressiveness the way I do.

Abstraction over low level interfaces

A special case of expressiveness is abstraction. When I get drunk and rant about programming languages (something that happens all too often) one of the words I use all the time is “abstraction”. As it turns out, being able to abstract things - no matter what they are - is one of the most important things in a programming language - for me. The ability to abstract objects and classes of objects is one of those. The ability to abstract functionality is another. The ability to abstract structure is a third. The ability to abstract syntax is a fourth. The ability to abstract programming substrate is another. The ability to abstract paradigm is another. And so on. Abstraction is really the possibility of making completely different things work together. It is also the possibility of making things communicate, no matter what they are. If you can abstract on any dimension, this means that you can make your code communicate to any stakeholder. And that is important.

In fact, it is important enough that I’m considering adding another point, such as “Communication over implementation”. Should I?

Higher order functionality over explicitness

The higher you get on the order of things, the more declarative you can be. And the more declarative you can be, the easier it is to communicate intent. The disadvantage is of course that it is hard to judge the implementation behavior of something if you only communicate the intent of code - but I find that argument is false. If the intent is correctly specified, it should be possible to actually have the correct implementation behavior, no matter what.

First class over implicit functionality

My ThoughtWorks colleague Bradford Cross talks about First class oriented programming, and has written many blog entries where this shines through. I totally agree with him. The more things you can make first class in your language, the better. Because those things that aren’t… Well, they will be the conceptual walls of your language. That’s really all there is to it. So making as many things as possible first class will actually expand the borders of your nation. Ehm. Language, that is.

“Right is better” over “Worse is better”

OK, this I couldn’t avoid doing. I like Gabriel’s essay. I just think that the concept is abhorrent. I guess I’m of the mindset that can’t accept that bad things can thrive. I mean, from a logical standpoint I understand how it works. I just don’t want to be part of it. But from another level, actually releasing something half finished plays into Worse is better. So I’m not sure about this one. I just feel that the right solution is really better than the quick and dirty one.

Language oriented programming over APIs

What does Language Oriented Programming really mean? I’m not sure I can give you a canonical definition, but in my mind it goes back to the tradition of “little languages” in Lisp - which means you mold your environment into the perfect environment for solving your problem. And then you solve your problem in that environment.

I also believe that polyglot programming and external DSLs are an important piece of this puzzle. So all in all, the features in Ioke that enable LOP are the ease of creating internal and external DSLs, and the interoperability with both the JVM and the CLR. These things together make it easier to solve problems in a language oriented fashion.

“Code as data” over “Data as code”

The more I think about this one, the less sense it makes. What I wanted to express was the point of view that data and code aren’t really two different things, and that Ioke should treat them more or less the same - much like the Lisp tradition of symbolic expressions. In Ioke this is realized by having the AST being first class and core to the execution of any code. This AST can be modified in any way, created from scratch and so on, meaning that code and data get blurred together. There is also another discussion under the covers here, dealing with smart and dumb data. What is “code as data”? Does it deal with smart or dumb data? I don’t know.

Homoiconicity over syntax

Ioke plays a lot of cards surrounding homoiconicity - meaning that the structures used to represent code are the same as what you use to code. This allows easy access to lazy evaluation and syntactic macros, and is thus extremely powerful. This is one of the core parts of the expressiveness of Ioke, and it is more important than useful syntax. Which means that when the balance of syntax vs homoiconicity gets skewed, it will be syntax that pays the price.

Syntax over explicit APIs

But if you have actually looked at Ioke, you know that Ioke likes syntax. It doesn’t like it as much as Ruby or Perl, but it likes it much more than Dylan or Lisp. But in the same way Dylan works, Ioke syntax is generally canonical and maps very cleanly down to the message sending paradigm. This makes the AST clean and regular, without having to care about different syntax elements. Take assignment. Doing something like “foo = 42” will end up with the AST “=(foo, internal:createNumber(42))”. Here, “=” is clearly defined as just a simple message send, just as anything else. And in the same manner, the creation of a literal (the number 42) is also represented as a message send. The AST in Ioke is very regular and simple. It has exactly one node type - the message.
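A one-node-type AST is easy to model. The following Python sketch is not Ioke’s Message implementation: the `Message` class, its fields and the `rename` helper are assumptions made for illustration. It only shows how an assignment and a literal can both be ordinary message nodes, and how such a tree can be rewritten as plain data.

```python
class Message:
    """The only node type: a name plus zero or more arguments."""

    def __init__(self, name, *args):
        self.name = name
        self.args = args

    def __str__(self):
        if not self.args:
            return self.name
        return f"{self.name}({', '.join(str(a) for a in self.args)})"


def rename(node, old, new):
    """Return a copy of the tree with every message named `old` renamed."""
    if not isinstance(node, Message):
        return node  # raw literal payload, left untouched
    name = new if node.name == old else node.name
    return Message(name, *(rename(a, old, new) for a in node.args))


# "foo = 42": both the assignment and the literal are message sends.
assignment = Message("=", Message("foo"), Message("internal:createNumber", 42))
rewritten = rename(assignment, "foo", "bar")
```

Printing `assignment` gives the canonical form quoted in the text, and `rename` works without special cases because there is only one kind of node to visit.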

So saying syntax over explicit APIs means that the Ioke way of creating a new list is to use the [] method, not the list method. Because it reads better. I find that judicious syntax makes the intent of my code clearer. The right amount of syntax makes my code more readable, not less. This can be abused, but frankly - I am not in the business of stopping people from doing stupid things. The good thing about doing stupid things is that it tends to remove the person doing it from being productive, sooner or later. It is a self balancing equation.