Should languages be multi-lingual?


I’m currently sitting in the Beijing ThoughtWorks office, and for some reason language is on my mind… =)

One of the discussions related to DDD that have turned up several times the last few months at conferences
is how you handle ubiquitous language when your domain is not in English. Since most programming languages are based on English, you end up mixing English and Swedish for example, if you are working with a Swedish domain. Of course, the benefits of working with these concepts in Swedish are very hard to argue against. But the dichotomy between the programming language and the domain language is definitely something that hurts my eyes, so I’m generally not very fond of that approach.

In fact, I haven’t heard anyone come up with a good solution to this problem, and this post is not really a solution either.

One of the things I’ve proposed to make this situation better is to create an external DSL that is fully in the domain language. The implementation of that DSL can then be implemented in English. The main benefit is that there is a clear separation.between the domain language and the programming language. On the other hand, the overhead of creating the DSL and also the complexities involved in translating the domain concepts into programming language concepts can become problematic too.

One interesting idea in Cucumber is the idea that you can easily add new natural languages to write the features in. When it comes to user stories at the level of testing that Cucumber provides, it’s really important to use the right language. So it got me thinking, could you use the same kind of approach in a general programming language too?

As an experiment I took a small example program for Ioke, and translated it into Mandarin, with simplified Chinese characters. Of course I used Google Translate for this, so the translation is probably not very good, but the end result is still interesting. I’m not going to try to get this into my blog, so take a look at the file at github instead: http://github.com/olabini/ioke/blob/master/examples/chinese/account.ik. As you can see there is nothing in there that even reeks of English. If you don’t understand Chinese characters it is probably hard to see what’s happening here. Basically an Account object is created, with a “transfer” method and a “print” method. Further down, two instances of this Account object is created, some transfers are made, and then the objects are printed. But provided my translation is not too crappy, this code should make sense to someone reading Chinese.

Now, this is actually extremely simple to implement in Ioke, since it relies on several of the features Ioke handles very easily. That everything is a message really helps, and having everything be first class means I can alias methods and things like that without any worry. Obviously your language also need to handle non-ascii identifiers correctly, but that should be standard in this day and age.

When thinking about it, something similar to do this can be created in languages like Lisp, Smalltalk, Factor, Io and Haskell – but most other languages would struggle. If you have keywords in your language, it’s really a killer – you would need to branch your parser to make it happen.

Of course, this approach only works when you can simply translate from one word to another. If the writing system is right to left, or top to bottom, it’s much more tricky to create a good translation.

I’m also not sure if this is actually a really good idea or not. It might be. The other thing I’ve been thinking about is how to handle multilingual editing. What if you want to be able to switch back and forth between languages? How can you handle identifiers with more than one name. Would you want to?

Lots of unanswered questions here. But it’s still funny to think about. Communication is the main goal, as usual.



What is eval?


The glib answer to this question would be: “evil”. Of course, that doesn’t really tell us anything new. I wanted to explore the question of where in the spectrum eval fits in, in dynamic languages, and why the power of the language is ultimately increased by including eval.

Lately I’ve been saying that having eval is actually a roundabout way of having the interpreter be first class. After some thinking I’ve realized that this isn’t strictly true, which is why I wanted to spend some more time on eval.

The history of eval goes back to McCarthy’s paper on Lisp, long before Lisp was actually implemented. The interesting point is that the eval given in that paper can be used by the language itself, and the language can define its own semantics in term of itself, so a complete eval can be implemented in the language itself. This property is generally called a metacircular interpreter. Of course, having eval be this easy to implement in the language itself makes it extremely simple to also tweak it a bit and implement subtly different versions of the language. All of these advantages are not really based on eval itself, though, but rather in the fact that Lisp is so easy to define in terms of itself.

Eval shines more in languages where it’s really hard to define the semantics, like in JavaScript, Ruby or Perl. In these languages it is still possible to implement an eval in the language itself, but it’s extremely hard. In these languages, having eval gives you an escape hatch into the already implemented interpreter that is running the host code.

There are two different versions of eval in common use. Which one is mostly used depends on the type of the language. In homoiconic languages you will generally not give strings to eval, since you can just give the code to execute directly to eval. The typical example of this is Lisp, where eval takes an S-expression. Since it is so easy to build S-expressions (and it’s fundamentally more expressive), this means that this version of eval makes many things easy. Languages that are not homoiconic generally takes a string that contains the code, and will then parse the code and then execute it.

Most versions of eval also take an argument that contains the current binding information, or the current context. In some versions this is implicit and can never be sent in explicitly, while some languages (like Ruby and Lisp) allow you to send in the binding separately. For this to be powerful you obviously need a way to get at the binding in a current context, and then be able to store that somewhere.

So, in summary eval depends on two different capabilities that are more or less orthogonal. The first one is to call out to the interpreter and ask it to execute some code. The second is to be able to manipulate code contexts in a limited manner. Some languages allow you to do whatever you want with contexts, but that is definitely not the norm – since it disallows some very powerful optimization techniques. It is possible to get access to this information without sacrificing performance, though, as Smalltalk shows.

To get back to the question whether eval has anything to do with first class objects, we need to first look at what it actually means to be first class. Of course, the points for being first class depend to a degree on what language we are talking about. The wikipedia definition is that a first class object is something that can be used inside the programming language without restriction, compared to other entities in the language. In the context of an object oriented language, this would mean that you should be able to create new instances of it, you should be able to store it in variables, you should be able to pass it as arguments to methods and return it from methods. You should also be able to call methods on it, and so on.

Depending on how you see it, the eval function is generally pretty restricted in what you can do with it. Specifically, in Ruby, if you do the refactoring Extract Method on a piece of code that includes eval, eval will actually not work the same. This makes eval a fundamentally different method than all other methods in Ruby.

So lets change the question a bit – how can we make the interpreter first class while still retaining the simplicity of eval? The first step is to actually make the interpreter into a class. This class have one instance that is the currently running runtime. Once you have that object available at runtime, the next step is to be able to create new instances of the interpreter, and finally to be able to ask it to invoke code. The second piece of the puzzle is to make bindings/context first class, so you can create new ones at runtime and manipulate them. Once you have those two things together, eval will actually just be a shortcut to getting the current interpreter and the current runtime and ask it to evaluate some code.

Ioke doesn’t have it right now, but I have made place for it. There is an object called Runtime that reflects the current runtime. The plan is to make it possible to call mimic on in, and by doing so create a new interpreter from the current one. What is interesting is that this makes it possible to have some inherent security too. Since the second runtime mimics the first one, the second one won’t have capabilities that the first one lacks.

In Ioke a binding is just a regular Ioke object – nothing special at all really, and you can just create any kind of object and use that as a binding object. The core of simplicity in Ioke makes these operations that much simpler.

Eval is a strange beast, but at the end of the day it is still about accessing the interpreter. Generalizing this makes much more interesting things possible.



Communication over Implementation


Last week I wrote a post about some of the statements that percolate in my mind when designing Ioke. What I didn’t mention was that these ideas are things I use to judge other programming languages too. I would say that this philosophy pretty much captures my views on programming languages. (The post in question is here: The Ioke Philosophy). So, this post is of course not complete, and I don’t think I would ever be able to write something that is totally complete.

One of the things missing – and I did allude to it in the post – was a statement that has grown on me a bit. I did a presentation about Ioke last week, and at that point I decided I needed to talk about this some more. The statement in question is what I call Communication over Implementation. This turns out to be pretty important for programming languages in general, at least in my experience.

One of the things I’m fond of saying when talking about programming languages, is that programming languages – just like natural languages – are about communication. And we don’t necessarily always think clearly about who we are communicating with. The immediate and intuitive reaction to programming languages is that they are supposed to communicate with the compiler/interpreter/cpu. That is of course true, but it is also incidental in many cases. There are many ways in which you can communicate with the machine to get it to achieve something. So the question becomes what other parties should you consider when communicating.

The next most obvious party would be yourself. If you ever need to read your code, you need to write code in such a way that you can read it later on. This constrains the way you write code quite severely. There are reasons we don’t write much code in assembly language or JVM bytecodes anymore. Yes, at some level these descriptions are extremely nice communication towards the executing machine, but they are so bad at communicating with human stakeholders that the balance generally ends up in favor of more readable languages.

When communicating with human stakeholders the thing I focus most on is intent. If your code/text communicates the intent of what you are trying to do in a good way, this makes it easier to read. There are many movements that focus on how to do this well, where domain domain design and clean code are the two that immediately comes to mind for me.

So coming back to the title. For me, implementation is a special sort of communication – that kind of communication that is supposed to be functional and describe what should actually be done in deterministic instructions to a machine. As long as this communication is functional enough – meaning that the machine does more or less the right thing – there is much leeway in how the code can be written to make other kinds of communication easier. And that is the core of this argument. A language should make it easy to communicate with other stakeholders than the machine, since those other forms of communication with code is actually much more important than only the implementation pieces. Yes, if the implementation works your program might run for a while – but if no one can read the code it can’t be maintained, it can’t be understood except in a black box way, and the utility of the system will be limited.

Go the other way. If you have a program that communicates badly with the machine (but still well enough according to the above definition), but it is written in a clearly communicating way, this means that it is easier to grow the system, it is easier to fix it or implement it more correctly. It can also easier be replaced since the program communicates what it is doing.

There are exceptions to this principle. But we seem to to favor languages that are focused on implementation and only incidentally on communication. This is the wrong choice and it need to be fixed. Communication is at the core of programming, and should also be the focus of it.



The Ioke philosophy


I have in various circumstances used a list of statements, a kind of Ioke manifesto, that tries to give the spirit of what kind of guidelines I use when designing Ioke. Some of them are very serious, and some … well, more in jest. I thought I’d expand on them a bit there. In the tradition of these kind of manifestos, Ioke values the thing on the right, but values the thing on the left much more.

Oh, and remember that these ideas… They are really my ideas and thoughts and values. Nothing else. It is not in any way objective. So please don’t take offense, get riled up or start any holy wars.

Expressiveness over performance

This is really the full manifesto of Ioke, if you ever had to choose just one. In any situation where I have to choose between expressiveness or performance, expressiveness is always the answer. The side effect is of course that Ioke is not a fast language at all. But I believe it is one of the most expressive you can find – at least if you measure expressiveness the way I do.

Abstraction over low level interfaces

A special case of expressiveness is abstraction. When I get drunk and rant about programming languages (something that happens all too often) one of the words I use all the time is “abstraction”. As it turns out, being able to abstract things – no matter what they are – is one of the most important things in a programming language – for me. The ability to abstract objects and classes of objects is one of those. The ability to abstract functionality is another. The ability to abstract structure is a third. The ability to abstract syntax is a fourth. The ability to abstract programming substrate is another. The ability to abstract paradigm is another. And so on. Abstraction is really the possibility of making completely different things work together. It is also the possibility of making things communicate, no matter what they are. If you can abstract on any dimension, this means that you can make your code communicate to any stakeholder. And that is important.

In fact, it is important enough that I’m considering adding another point, such as “Communication over implementation”. Should I?

Higher order functionality over explicitness

The higher you get on the order of things, the more declarative you can be. And the more declarative you can be, the easier it is to communicate intent. The disadvantage is of course that it is hard to judge the implementation behavior of something if you only communicate the intent of code – but I find that argument is false. If the intent is correctly specified, it should be possible to actually have the correct implementation behavior, no matter what.

First class over implicit functionality

My ThoughtWorks colleague Bradford Cross talks about First class oriented programming, and has written many blog entries where this shines through. I totally agree with him. The more things you can make first class in your language, the better. Because those things that aren’t… Well, they will be the conceptual walls of your language. That’s really all there is to it. So making as many things as possible first class will actually expand the borders of your nation. Ehm. Language, that is.

“Right is better” over “Worse is better”

OK, this I couldn’t avoid doing. I like Gabriel’s essay. I just thing that the concept is abhorrent. I guess I’m of the mindset that can’t accept that bad things can thrive. I mean, from a logical standpoint I understand how it works. I just don’t want to be part of it. But from another level, actually releasing something half finished plays into Worse is better. So I’m not sure about this one. I just feel that the right solution is really better than the quick and dirty one.

Language oriented programming over APIs

What does Language Oriented Programming really mean? I’m not sure I can give you a canonical definition, but in my mind it goes back to the tradition of “little languages” in Lisp – which means you mold your environment into the perfect environment for solving your problem. And then you solve your problem in that environment.

I also believe that polyglot programming and external DSLs are an important piece of this puzzle. So all in all, the features in Ioke that enable LOP is the ease of creating internal and external DSLs, and the interoperability of both the JVM and the CLR. These things together make it easier to solve problems in a language oriented fashion.

“Code as data” over “Data as code”

The more I think about this one, the less sense it makes. What I wanted to express was the point of view that data and code aren’t really two different things, and that Ioke should treat them more or less the same – much like the Lisp tradition of symbolic expressions. In Ioke this is realized by having the AST being first class and core to the execution of any code. This AST can be modified in any way, created from scratch and so on, meaning that code and data get blurred together. There is also another discussion under the covers here, dealing with smart and dumb data. What is “code as data”? Does it deal with smart or dumb data? I don’t know.

Homoiconicity over syntax

Ioke plays a lot of cards surrounding homoiconicity – meaning that the structures used to represent code are the same as what you use to code. This allow easy access to lazy evaluation and syntactic macros, and is thus extremely powerful. This is one of the core parts of the expressiveness of Ioke, and it is more important than useful syntax. Which means that when the balance of syntax vs homoiconicity gets skewed, it will be syntax that pays the price.

Syntax over explicit APIs

But if you have actually looked at Ioke, you know that Ioke likes syntax. It doesn’t like it as much as Ruby or Perl, but it likes it much more than Dylan or Lisp. But in the same way Dylan works, Ioke syntax is generally canonical and maps very cleanly down to the message sending paradigm. This makes the AST clean and regular, without having to care about different syntax elements. Take assignment. Doing something like “foo = 42” will end up with the AST “=(foo, internal:createNumber(42))”. Here, “=” is clearly defined as just a simple message send, just as anything else. And in the same manner, the creation of a literal (the number 42) is also represented as a message send. The AST in Ioke is very regular and simple. It has exactly one node type – the message.

So saying syntax over explicit APIs means that the Ioke way of creating a new list is to use the [] method, not the list method. Because it reads better. I find that judicious syntax make the intent of my code better. The right amount of syntax makes my code more readable, not less. This can be abused, but frankly – I am not in the business of stopping people from doing stupid things. The good thing about doing stupid things is that it tends to remove the person doing it from being productive, sooner or later. It is a self balancing equation.



The Ioke object hierarchy


When designing the core Ioke libraries, I’ve tried to factor things out into different places, as much as possible. This have given rise to a pretty complicated object hierarchy, so I thought I’d describe it a bit closer here. This should make it a bit easier to understand how lookup of methods work, and where you should add functionality for the right effect. In this blog post I will go through most of the kinds in Ioke, notice how they fit together and also what kind of methods they contain.

OK, so let us start at the top. Even here we run into a small problem, of course – since Ioke is not single inheritance. One object can mimic more than one object, or none. There is no absolute requirement that everything is part of the same inheritance chain. We will get back to this in a while, but at first we can ignore that, and imagine that the hierarchy is simple.

At the top of the Ioke hierarchy is the object called Base. This object has no mimics, and contains only the most essential methods. These are methods that allow the introspection and modification of cells, such as “cell”, “cell=”, “cell?”, “cellNames”, “cells” and so on. It also includes “documentation” and “documentation=”. It includes the assignment method “=”. And finally it contains “mimic” that allow you to create new instances. This method is basically the only way of allocating new objects in Ioke, which means it is pretty important. If you need the functionality but without mimicking Base, you can either copy the method itself, or mimic BaseBehavior (we’ll get back to this in a minute.)

OK, so Base is the absolute base of everything in Ioke. But Base is not the equivalent of the Object found in most other languages – in most cases you won’t get the expected behavior if you mimic directly from Base. The object that is supposed to be used as the origin of new objects is called Origin. But before we go there, we need to talk about Ground.

Ioke actually overloads the term ground to mean two slightly different things. First, there is an object called Ground, and secondly, the ground of an evaluation is basically the default context. The default ground at the top level is the object Ground. Now, Origin mimics Ground. This means that if you add something to the top level, that will be in the inheritance chain of all objects. This is the closest you will get to a global scope in Ioke.

The Ground object mimics different things depending on which implementation you are running. If you’re on ikj, Ground will mimic both IokeGround and JavaGround, where JavaGround is the object where all Java integration machinery resides. On ikc, Ground currently only mimics IokeGround, but I expect to add CLRGround sooner or later. IokeGround is the place where all so called global names in the system is defined. This is where you find the cells for “Text”, “Symbol”, “Number”, “Range” and also “false”, “true” and “nil”. Yes, that is right, these names are not constants – they are just cells on IokeGround.

IokeGround mimics Base – so that is where the connection to the base of the hierarchy actually is made. But IokeGround also mimics and object called DefaultBehavior. As the name implies, this is where most of the Ioke behavior resides. DefaultBehavior in itself is basically a mixin, only providing these methods. You should not mimic DefaultBehavior in itself, if possible.

(Incidentally, that is the difference between an object that is supposed to be a mixin, and a regular object. In Ioke there is no real difference at the structure level – it is only about intent.)

DefaultBehavior doesn’t contain many methods in itself, but instead mixes in several more focused objects, such as “DefaultBehavior Aspects”, “DefaultBehavior Assignment”, “DefaultBehavior Boolean”, “DefaultBehavior Case”, “DefaultBehavior Conditions”, “DefaultBehavior Definitions”, “DefaultBehavior FlowControl”, “DefaultBehavior Internal”, “DefaultBehavior Literals” and “DefaultBehavior Reflection”.

If you want to make the intent of a mixin very clear, you mimic the Mixins object. This object include the bare necessary things to add new cells and introspect on them. “Mixins Comparing” and “Mixins Enumerable” are two examples of such mixins.

And that is basically the full story. Any other object you will find in the system is almost certainly derived from Origin, and the hierarchy under Origin is pretty shallow and flat, at least for now. I’ve considered making the intent of different collections a bit more specific, in the manner of Smalltalk – but this hasn’t happened yet.



Message chains and quoting in Ioke


One of the more advanced features in Ioke is the ability to work with first class messages. At the end of the day, you are manipulating the AST directly by doing this, which means that you can do pretty much anything you want. The manipulation of message chains is the main way of working with macros in Ioke, so understanding what you can do with them is pretty important.

The documentation surrounding these pieces is spread all over the place, so I thought I’d take a look at messages and the way you construct and modify them.

Messages

The first step in working with message chains is to actually understand the Message. Message is the core data structure in Ioke, and it has some native properties that define the full structure of the Ioke AST. There are four pieces of the structure that is central to messages, and a few more that is less interesting. So let us look at the core structure. It is actually extremely simple. These are the things that makes a Message:

  • Name – all messages have a name. From the perspective of Ioke, this is a symbol. It will never be nil, but it can be empty.
  • Arguments – a list of messages, zero or more.
  • Prev – a pointer to the previous message in the chain, or nil if there is no previous.
  • Next – a pointer to the next message in the chain, or nil if there is no next message.

A message can also wrap a value. In that case the message will always return that value, and no real evaluation will happen. This can be used to insert any kind of value into a message chain that will later be evaluated. This is called wrapping.

A message chain is just a collection of messages, linked through their Prev and Next pointers.

The arguments to a message are represented as a list of messages. This make sense if you think about it for a few seconds.

OK, now you know what the Ioke AST looks like. It isn’t harder than that. Now, if you actually want to start working with messages, there are several messages that Message can receive, that allow you to work with them. The simpler ones (that I won’t explain closer) is “name”, “name=”, “arguments”, “arguments=”, “next”, “next=”, “prev”, “prev=”.

There are a few more interesting ones that merit some explanation. First, “last”. This message will just return the last message in the message chain. It is the equivalent of following the next pointer until you come to the end.

It’s important to keep in mind that Message is a mutable structure, which means you need to be careful to not change things that will give you unexpected changes. For example, if someone sends in a message, you shouldn’t generally actually modify that without copying it. Now, if you only want to copy a message without copying recursively the next pointer, you can just mimic it. Otherwise you use the method “deepCopy” which will actually copy both the next pointer and the arguments recursively.

Now, if you want to add new arguments to a message, you can use “appendArgument”. This method is aliased as “<<“. It will also return the receiver, so you can add several arguments by linking calls to appendArgument/<<. If you want to add a message at the beginning of the argument list, you instead use >>.

One of the more annoying things is that once you set the next pointer, you generally need to make sure to set the previous pointer of the next value too, unless you are setting it to nil. The same thing is true when setting the prev pointer. So, in the cases when you want to link two messages, you shouldn’t set these specifically, but instead use the “->” method. This allow you to link two Ioke messages. For example “msg1 -> msg2” will actually set the next pointer on msg1 and the prev pointer on msg2. If you do “msg1 -> nil” it will set the next pointer to nil.

And that’s basically it. If you need to actually evaluate the messages, you can either use “sendTo” or “evaluateOn”. The main difference here is that sendTo will actually not evaluate the message chain. It will only evaluate the message that is the receiver of the call. The evaluateOn method will follow the message chain and evaluate it fully, based on the context arguments given to it.

Oh, one last thing. To create new messages from scratch, there are a few different ways. First of all, you can wrap a value like this: “Message wrap(42)”. That will return a new message that wraps the number 42.

You can create a message chain from a piece of text by doing ‘Message fromText(“one two three”)’. This will return a message chain with three messages, linked together.

Finally, you can create a new message chain by using the from-method. You use it like this: “Message from(one two(three) four)”. What is returned is the message chain that is the argument. If you think about it for a few seconds, you can probably guess how to implement this using an Ioke macro.

Quoting

Now that we understand messages and message chains, let us take a look at how to create new chains in a flexible way.

First of all, all of the above methods are all very useful and nice, but they tend to be a bit verbose. Coming from a Lisp background I felt inclined to put the quoting characters to good use for this. So, first of all, the single quote (‘) does the same thing as “Message from”. The back quote (`) does the same thing as “Message wrap”. So, to wrap the number 42, you can just do `42. In this case you don’t need parenthesis, since the back quote is an operator. To create a new message chain, use the single quote: ‘(foo bar(x) baz).

We almost have everything we need, except that we need some convenient ways of actually putting things into these message chains without having to put them together by hand.

Say for example we have a variable “blah” that contains an unknown message. We want to create a message “one” that is followed by the message in the variable “blah”. And then finally we want to add two messages “bax” and “baz” after it. We could do it like this: x = ‘one. x -> blah. x last -> ‘(bax baz). All in all, that is not too bad, but we can do better. This is done using the splice-quote operator, which is just two single quotes after each other. Using that it would look like this: ”(one `blah bax baz). In this case, the back quote inside of the splice-quote call will actually be evaluated in the current context and then have the result be spliced into the message chain being created. Now, only use the back quote if you are sure you can modify it. If you want to copy blah before inserting it, use the single quote again, instead of the back quote: ”(one ‘blah bax baz)

All in all, this is really all you need, and you can take a look at the core libraries and see how they are used. A typical example is the comprehensions library, and also the destructuring macros. In general, creating these message chains on the fly is the most useful inside of syntax macros.

I am planning to add a new feature to Ioke, that allow you to do tree rewriting for manipulating chains in different ways. This will be a feature built on top of the primitives described here, and these features will continue to be the main way of working with message chains for a long time.



The Ioke reflector


Ioke is a very object oriented language. The amount of concepts in it is also quite small. I’ve tried to factor the implementation into smaller pieces, which are then composed into the full object hierarchy. One of the side effects of this is that it is possible to make objects that can’t be handled generically, since they don’t have all those things you expect objects to have. There is no requirement on Ioke objects to actually have a core set of methods. Things can be totally blank. Of course, a completely blank object is not very useful in itself…

So what kind of things become problematic then? Well, lets first talk about blank slates. In Ioke it’s pretty simple to create one – you just create an object, and then call the method “removeAllMimics!”. What you have after that point is something that is totally empty (unless if you added cells to the object itself before hand).

Now say that you want to add back a mimic to that object. Or say you want to add a new cell. As it happens, “=” is just a method on the Base kind. To add a new mimic, you generally use either “mimic!” or “prependMimic!”. These are both methods on ReflectionBehavior. So none of these methods are available.

Another example is mixins. In Ioke, a Mixin is nothing other than a regular object, except that it doesn’t have Base or DefaultBehavior in its mimic chain. Instead, the root of all mixins is the kind Mixin. Now, Mixin actually defines a copy of “=” and a few other things from Base. But it doesn’t allow you to add new mimics to mixins for example. This is a bit annoying – but since you have access to “=” you can actually fix it, by coping the “mimic!” method from ReflectionBehavior.

There are other circumstances where you want to be able to handle any object whatsoever in a uniform manner, introspect on a few things and so on. That wasn’t possible.

I’ve been considering these problems for a while, and finally decided that the pragmatic way of solving it is to add some methods that can be used to take a part and modify objects from the outside. I call the namespace that gives this functionality Reflector. Now, remember that you shouldn’t use the Reflector unless you’re certain that your objects need to be handled in that specific way. It’s more verbose, and not as object oriented as the methods that exist on regular objects. But it does allow you to do certain things on all objects in the system.

The rules are simple – all methods on Reflector are based on other methods in Base or ReflectionBehavior. They have the same name, except that they all start with “other:”. They all take an initial argument that is the object to work on. So, say we have an object “bs” that is a blank slate. We want to add a cell to it. We can do that simply like this:

Reflector other:cell(bs, :foo) = 42

Here we set the cell “foo” on the blank slated object to the value 42.

The problem of adding a new mimic to a mixin is easily solved too. Say we have an object ExtraEnum that we want to mix in to Enumerable. We can do it like this:

Reflector other:mimic!(Mixins Enumerable, ExtraEnum)

And so on. You can see all the methods available on Reflector in the doks.

As mentioned, the Reflector is a pretty specialized tool, but it’s totally indispensable when you need it. I will pretty soon sit down and rewrite DokGen to use it, since DokGen is a typical example of something that really need to be able to handle all objects in a generic way.



Languages should die


One of those interesting facts about evolution is that most of the species that ever existed are now extinct. There are good reasons for this. The average life time for a mammalian species is a few million years. There are currently about 5400 species of mammals alive right now, and the mammalian class has existed for about 300 million years. If the current number is representative, there has been about 810 000 different species of mammals during the history of that class. So less than 1% are alive today. And this is only talking about mammals.

Many programming languages have died out, but there are still several old programming languages making a living. Cobol is still the backbone of much infrastructure in the world – it is even evolving still (Cobol 2002 include support for object orientation). Making another analogy with evolution, Cobol would be the sharks of programming languages. They have been around for a long, long time – at least 400 million years in the case of sharks, and 50 for Cobol – and are still recognizable. Fortran is another language that has been used for a long time, and has evolved substantially.

But we also have newer languages. From the view of business and technology, Java and other C-based languages are way ahead. In the dynamic realm, we have Perl, Python and Ruby. All of them over 14 years old. And then we have the really new languages, like Scala and Clojure, who are rapidly gaining use.

Maybe it is a bad thing that these languages that we use day to day have been so long lived. I’m not saying they should die out totally – I know that will never happen. But it might have been better if we had spent energy on fixing the Java language, instead of creating more and more tools to fix deficiencies in the language. The good part about Java is the JVM, and that could have been at least as good now even if we had evolved the language more.

It’s obvious that humanity isn’t evolving like other animals. Natural selection is still happening, but the results are getting skewed by better medicine, social infrastructure and many other inventions. These are tools that make it better for us – with the side effect that natural evolution is changing course. Something similar seems to have happened with Java – we have created so many tools to fix Javas problems, that there isn’t enough pressure to fix the language. I tend to believe that this is a good strategy for humanity, but a bad strategy for language development.

I’m happy that more and more new languages are gaining play on the JVM, on LLVM, the CLR and Parrot. That’s great. But on the other hand I’m seeing threads on Ruby-talk asking whether Ruby can stay ahead. Why would we want that? Wouldn’t it be better to take the best pieces of Ruby, and build on that? In the keynote at the last RubyConf, Dave Thomas proposed that people fork the language. I think that would be a great thing if it started to happen more.

I guess I’m sounding a bit like “yeah, all the current effort is good, but it could be better!”. And it could be. I think the way most people look at languages are all wrong.

Michael Feathers wrote a post on Artima called Stunting a Framework. Maybe we should look at language development in that way? Is this another way to approach DSLs, or can we get more useful, smaller languages, by using such an approach? I think it would be very interesting, no matter what.

Languages should die and be replaced. There should be less cost involved in developing a language. Languages should work on common infrastructure that makes this easy – and we’re finally hitting that mark with machines such as the JVM, LLVM, CLR and Parrot.

So – you should go out and fork your favorite language. Or your least favorite language. Fix the things that you don’t like with it. Modify it for a specific domain. Remove some of the cruft. Create a new small language. It doesn’t have to be big. It doesn’t have to do everything – one of the really cool things about functional languages is that they are often have a very small core. Use a similar approach. Hack around. See what happens.

No existing language will die in a long time. But we need new ideas. New fresh ways of looking at things. And new languages that incorporate what other languages do right. That’s how we will finally get rid of the things we don’t like in our current languages.

You should be ready to abandon your favorite language for something better.



Static type thinking in dynamically typed languages


A few days back I said something on Twitter that caused some discussion. I thought I’d spend some time explaining a bit more what I meant here. The originating tweet came from Debasish Ghosh, who wrote this:

“greenspunning typechecking into ruby code” .. isn’t that what u expect when u implement a big project in a dynamically typed language ?

My answer was:

@debasishg really don’t agree about that. if you handle a dynamically typed project correctly, you will not end up greenspunning types.

Lets take this from the beginning. The whole point of duck typing as a philosophy when writing Ruby code is that you shouldn’t care about the actual class of an object you get as argument. You should expect the operations you need, to actually be there, and assume they are. That means anyone can send in any kind of object as long as it fulfills the expected protocol.

One of the problems in these kind of discussions is that people conflate classes with types in dynamic languages. In well written Ruby code you will usually end up with a type for every argument – that type is a number of constraints and protocols that will wary depending on what you do with the objects. But the point is that it generally will make things more complicated to equate classes with these types, and you will design classes without any real purpose. Since you don’t have static checking, you don’t need to have overarching classes that act as type specifiers. The types will instead be implied in the contract of a method.

So far so good. Should you keep this in mind when designing a method? In most cases, no. I tend to believe that you will end up conflating classes and types. That’s what I’ve seen on several projects at least. The first warning sign is generally kind_of? checks. Of course, you can do things this way, but you will restrict quite a lot of the power of dynamic typing by doing this. One of the key benefits of a dynamically typed language is the added flexibility. If you end up greenspunning a type system, you have just negated a large part of the benefit of your language.

The types in a well defined system will be implicitly defined by what the method actually does – and specified by the tests for that method. But if you try to design explicit types, you will end up writing a static type system into your tests, which is not the best way to develop in these languages.



Ioke E released


After a slightly longer development cycle then the last time, I am finally happy to announce that Ioke E has been released.

Ioke is a language that is designed to be as expressive as possible. It is a dynamic language targeted at the Java Virtual Machine. It’s been designed from scratch to be a highly flexible general purpose language. It is a prototype-based programming language that is inspired by Io, Smalltalk, Lisp and Ruby.

Homepage: http://ioke.org
Download: http://ioke.org/download.html
Programming guide: http://ioke.org/wiki/index.php/Guide
Wiki: http://ioke.org/wiki

Ioke E is the third release of Ioke. It includes many new features from Ioke S, among them full support for Java integration (including implementing interfaces and extending classes), many new methods in the core libraries, IOpt, TextScanner, BlankSlate, better support for regexp interpolation and escapes in regexps, support for Unicode analysis and much more.

Ioke E also includes a large amount of bug fixes.

Features:

  • Expressiveness first
  • Strong, dynamic typing
  • Prototype based object orientation
  • Homoiconic language
  • Simple syntax
  • Powerful macro facilities
  • Condition system
  • Aspects
  • Java integration
  • Developed using TDD
  • Documentation system that combines documentation with specs
  • Wedded to the JVM

The many things added in Ioke E could not have been done without the support of several contributors. I would like to call out and thank:
T W <twellman@gmail.com>
Sam Aaron <samaaron@gmail.com>
Carlos Villela <cv@lixo.org>
Brian Guthrie <btguthrie@gmail.com>
Martin Elwin <elvvin@gmail.com>
Felipe Rodrigues de Almeida <felipero@gmail.com>
Wes Oldenbeuving <narnach@gmail.com>
Martin Dobmeier <martin.dobmeier@googlemail.com>
Victor Hugo Borja <vic.borja@gmail.com>
Bragi Ragnarson <bragi@ragnarson.com>
Petrik de Heus