Ioke E released


After a slightly longer development cycle then the last time, I am finally happy to announce that Ioke E has been released.

Ioke is a language that is designed to be as expressive as possible. It is a dynamic language targeted at the Java Virtual Machine. It’s been designed from scratch to be a highly flexible general purpose language. It is a prototype-based programming language that is inspired by Io, Smalltalk, Lisp and Ruby.

Homepage: http://ioke.org
Download: http://ioke.org/download.html
Programming guide: http://ioke.org/wiki/index.php/Guide
Wiki: http://ioke.org/wiki

Ioke E is the third release of Ioke. It includes many new features from Ioke S, among them full support for Java integration (including implementing interfaces and extending classes), many new methods in the core libraries, IOpt, TextScanner, BlankSlate, better support for regexp interpolation and escapes in regexps, support for Unicode analysis and much more.

Ioke E also includes a large amount of bug fixes.

Features:

  • Expressiveness first
  • Strong, dynamic typing
  • Prototype based object orientation
  • Homoiconic language
  • Simple syntax
  • Powerful macro facilities
  • Condition system
  • Aspects
  • Java integration
  • Developed using TDD
  • Documentation system that combines documentation with specs
  • Wedded to the JVM

The many things added in Ioke E could not have been done without the support of several contributors. I would like to call out and thank:
T W <twellman@gmail.com>
Sam Aaron <samaaron@gmail.com>
Carlos Villela <cv@lixo.org>
Brian Guthrie <btguthrie@gmail.com>
Martin Elwin <elvvin@gmail.com>
Felipe Rodrigues de Almeida <felipero@gmail.com>
Wes Oldenbeuving <narnach@gmail.com>
Martin Dobmeier <martin.dobmeier@googlemail.com>
Victor Hugo Borja <vic.borja@gmail.com>
Bragi Ragnarson <bragi@ragnarson.com>
Petrik de Heus



Laziness in Ioke


Since all the rage in blog posts at the moment is laziness, I thought I’d take a look at how something can be lazy in Ioke. There are several ways of doing this, but most of them are really easy. I haven’t decided exactly how the implementation in core should look like, but it would probably be something like the following.

Basically, for laziness in a dynamic language you want something that lazily evaluates itself when requested the first time, and after that always return that value. Something like this:

foo = Origin mimic
foo bar = lazy("laziness has run!" println. 42)
"helo" println
foo bar println
foo bar println

Here we create a lazy values that returns 42, and prints something to prove what is happening. The lazy-method contains the magic. Any code could be submitted here. In this case the code can lexically close over variables outside, if wanted.

How would we implement “lazy” then? Something like this would suffice:

DefaultBehavior lazy = dmacro(
  [code]
  laziness = fnx(
    result = code evaluateOn(call ground)
    cell(:laziness) become!(result)
    result
  )
)

This code creates a lexical block and returns it. That block is a closure around the argument code. When called, it will evaluate that code, and then change itself to become that result value. Most of the magic here is really in the “become!” operation. And that is how simple it is. Since Ioke tries to make things representation independent, this implementation of lazy works in basically all cases.



Video of my geek night “Creating a language on the JVM” presentation


The day before QCon 10 days ago, I gave a presentation at the ThoughtWorks geek night, about creating languages for the JVM. It ended up being two hours long, and is now available from Skillsmatter, here: http://skillsmatter.com/podcast/java-jee/language-on-jvm.



IKanServe - an Ioke web framework


The last few days I’ve cobbled together IKanServe, which could be called a web framework, if you’re really generous about the meaning of the words. It doesn’t really do that much right now. But it does allow you to serve dynamic web pages using Ioke.

Much of the hard parts in the Ioke servlet was totally cribbed from Nick Sieger’s JRuby-rack, which is a very well-engineered project.

IKanServe is basically three different parts. First and most importantly, it’s an IokeServlet, that takes care of starting runtimes and pooling them. The second part is the Ioke parts of dispatching, that will turn headers and parameters into a dict, and then find matching actions and dispatch them. The third part is a very small builder library that lets you create HTML with Ioke code. In fact, the code for the html builder could be quite interesting, so let’s start with that. It’s very small, promise!

use("blank_slate")
IKanServe HtmlBuilder = BlankSlate create(fn(bs,
    bs pass = method(+args, +:attrs,
      args "<%s%:[ %s=\"%s\"%]>%[%s%]</%s>\n" format(
        currentMessage name, attrs, args, currentMessage name))))

As you can see, we use a blank_slate library, that is part of the standard library of Ioke. If you’re familiar with Ruby, you know what a blank slate is. Ioke’s blank slate is a bit more blank than the Ruby equivalent, though. To create a blank slate, we call the create method, and then send in a lexical block where we can set up anything we want to have on the blank slate. After create is finished, it will be impossible to add new stuff to the blank slate, so this is the only place we can do that. What we do here is to add a “pass” method. Pass is mostly the same as method_missing in Ruby. That method will create a tag from the method name, add all keyword arguments as attributes, then add all the content inside the tag and finally close the tag. The funky looking syntax in the Text literal is mostly the same as the % or sprintf methods in Ruby.

There is some subtle things going on in this method, but to go into them would derail this whole post. Before abandoning the HTML stuff, you might want to know how to use it. This is how:

h = IKanServe HtmlBuilder
title = "I kan serve HTML!"
h html(
  h head(h title(title)),
  h body(
    h h1(
      style: "font-size: 1.5em",
      "Isn't this cool?"
    )
  )
)

HtmlBuilder doesn’t have any state, so you just use it immediately. I alias it as “h” to make it easier on the eyes. That means that every call to the “h” object will return html. That’s how I can differentiate between creating the tag “title” and referring to the variable “title”. This is very easy stuff, but it seems to work well for smaller cases.

OK. Let’s get back to the rest of IKanServe. What do you need to use it? In your WEB-INF, you need to have three files in the “lib” directory - namely “ioke.jar”, “ioke-lib.jar” and “ikanserve.jar”. Since Ioke is inbetween releases you will have to build these yourselves, but it’s pretty simple. You can build the first two using the target “jar-lib” in the Ioke source. The third can be created by checking out ikanserve from Github: http://github.com/olabini/ikanserve, and then building it with ant.

Once you have those in your WEB-INF, you also need a web.xml file. It should look something like this:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE web-app PUBLIC
  "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
  "http://java.sun.com/dtd/web-app_2_3.dtd">
<web-app>
  <context-param>
    <param-name>ioke.max.runtimes</param-name>
    <param-value>2</param-value>
  </context-param>

  <context-param>
    <param-name>ioke.min.runtimes</param-name>
    <param-value>1</param-value>
  </context-param>

  <servlet>
    <servlet-name>IokeServlet</servlet-name>
    <servlet-class>ioke.lang.ext.ikanserve.IokeServlet</servlet-class>
    <load-on-startup/>
  </servlet>

  <servlet-mapping>
    <servlet-name>IokeServlet</servlet-name>
    <url-pattern>/*</url-pattern>
  </servlet-mapping>

  <listener>
    <listener-class>ioke.lang.ext.ikanserve.IokeServletContextListener</listener-class>
  </listener>
</web-app>

Nothing spectacular, just set up the number of runtimes, the servlet and a servlet context listener.

And that’s it. Now you can build a war file and it will be ready to serve. Of course, if you deploy it and try to go to any URL, you will get a message that says: “Couldn’t find any action corresponding to your request. Sorry!”. That’s the default 404 action.

To actually create some code that does something, you need to create a file called iks_application.ik. This file should be places in WEB-INF/classes. Any other Ioke files you want to use should be on the classpath, and you can just use “use” to load those files. What can you do in IKanServe then? Well, basically you can bind actions to urls. Right now there is only one way to bind urls, and that is with regular expressions. A typical small action can be bound like this:

IKanServe actionForPath(#r[^/bar],
  method("you requested this: #{request pathInfo}"))

Here we bind any URL that starts with /bar to the method sent in. At the moment you have access to a request object, and the return value of the method will be rendered as the result. And that’s basically it. As mentioned above, it’s not so much a framework as the initial skeleton for one - but it still works quite well.

If you want to try this out in the easiest way possible, you can do “ant example-app” in the ikanserve source tree, then do “cd jetty” and finally do “java -jar start.jar”. This will build and deploy an example application and then start a Jetty instance to serve it. Go to localhost:8080 and try /foo, /bar or any other URL, and see it in action. You can see the code for this small app in web/WEB-INF/classes/iks_application.ik.

And that’s all for this time. Have fun.



Bottom Types in Dynamic Languages


On the Friday of QCon, Sir Tony Hoare held a presentation about the null reference. This was interesting stuff, and it tied into something I’ve been thinking about for a while. Namely, the similarities and differences in the role of null/nil in static and dynamic languages.

At the surface, a null reference in a statically typed language seems to be a value of a bottom type. Since null is still a value, that cannot be true - since according to the formal definition, a bottom type can have no real value.

To take a typical example of where a bottom type is needed, take a look at what happens in Scala when you have a method that always throws an exception. Of course, that method can never return - so what is the type of the return value? That type is Nothing, the bottom type in Scala. There are no values of type Nothing in Scala, and can never be. The Nothing type is really useful in other circumstances too. Lists in Scala correspond to functional linked lists, where each cons cell is immutable. Every cons cell has a type - this type is the most generic type of the value the cons cell points to, and the type of cons cell it points to. Recursively, this means that the full list will have a generic type that is the least generic type that can encompass every element of the list. But how does this really work for the end element? Every list needs to have a final element that stops the list. This end element generally doesn’t contain a value. So what is the type of that end element? List[Nothing]. The reason being that Nothing can be viewed as the subtype of every other type in the system, which means it is always more specific than any other type. But when you create a new cons cell that points to this end value, the full list will always get the type of the new cons cell.

That explains bottom types, but we are no closer to understanding null references. In a statically typed language, a null reference acts like a value of a bottom type. The reason is that you can assign null to any other type - and this is generally only true for types that are subtypes of the place where it is being assigned to. What is interesting about null in C-like languages, is that there is no type that actually corresponds to null. In languages with null references like this, it seems that the domain of every other type in the system is extended to include null as a value. I haven’t found any reference to how this works from a type checking perspective, but it does look like the bottom type - except that there is no type associated to it. So from now on I will just call null a bottom value. One of the interesting aspects of including a bottom value in your language is that it opens up for runtime errors. If you don’t check every place that uses a reference for the bottom value before using it, you have the potential to get a runtime error.

The reason I started thinking about this was actually to contrast the way null works in a statically typed language, and how it works in a dynamically typed language. But the points above make it more and more obvious that null is actually a runtime feature even in static languages. The type system bypass nulls to allow more dynamism at runtime. But take a look at a dynamically typed language. Of course you won’t have any bottom types, since you don’t have a type system that need to check types. Another feature of most dynamically typed languages is that everything is an expression. In these languages, nil is usually used as a sentinel value to mark that the operation didn’t actually result in any real value. The semantics of how nil is interpreted can depend on both the language and the specific API of the operation in question.

The crucial difference in how nil is handled in a dynamically typed language, compared with how a null reference in a statically typed language works, is one of high level approach. In particular, in a dynamically typed language you will not know what kind of value you get back from an operation. In Ruby you might get back something that is of an expected class, but you can also get something that just looks like that class. Or something totally different, like a recorder object for example. All of these things make nils in dynamically typed languages much less of a problem, since the way you develop in these languages tend to be quite different. In contrast, a null reference in Java is a loop hole, that basically means that anything can return either null or something of the type specified. It’s not exactly a lie, but it is an implicit effect that generally comes as a surprise in a statically typed language.

There is a proposal to add a @NotNull annotation to Java, so you can mark references that shouldn’t allow null. This is not good enough, and will probably not help much at this stage. The reason is that you will be penalized for avoiding null - not the other way around. It also makes code that tries to be correct _more_ verbose, instead of the other way around. Because of backwards compatibility, there is really nothing to do about this though. But I recommend language designers to think carefully before adding a Java-style reference.

Some people asks what the alternative is. What stronger type systems generally allow you to do is have an Option (or Maybe) type, that is genericized on the expected output value.

Say you define a method in Java called get() that will either return a String or a null. If you instead had an Option class in Java, you could define the method as “Option<String> get()”. What is neat about Option is that it’s abstract and it has got two concrete subtypes. The first one is Some, the second is None. This makes it explicit when you can expect a return value to be not be there, and at the same time forces the user of the method to actively handle this case. The default case is references that will always be valid. This a good way to handle this problem in a type system.



The ThoughtWorks Geek Night


Last Tuesday ThoughtWorks hosted a Geek Night, where I was the headline. My presentation was about Creating languages for the JVM, which as it turns out is a quite broad topic. The London office do these kind of geek nights now and again, and this time we ended up being about full capacity for what the office could hold, which is about 50-60 people. It was really crowded, actually. That is a good problem to have, though. I am glad so many turned up to see me do a two hour rambling presentation that ended up being like a non-stop fire hose of information.

I’m not going to try to summarize the actual presentation. There were way to much information in it for that. Luckily, SkillsMatter had a film camera there, and I’ve been told the video should be up anytime. Once that happens I will post about it here. In the meantime, the slides can be found here: http://olabini.com/presentations/CreatingALanguageForTheJVM.pdf.

It was an interesting subject to present about though. I think this material could easily be a one-day tutorial. I wonder if anyone would be interesting in having such a tutorial at their conference? And whether anyone would show up for such a subject…

It was fun, though. I had a great time, and I hope the people who showed up didn’t end up being disappointed with the material.



Ioke at Amsterdam.rb


Monday evening I’ll be in Amsterdam, talking about Ioke to Amsterdam.rb, inbetween some other presentations. Seeing as this is going to be the first public presentation about Ioke, I’m absolutely thrilled. It’s going to be great fun. I still haven’t decided exactly how to present Ioke, but I think it’s going to be eased by this being a Ruby crowd.

As far as I understand, the evening is actually fully booked, so there is no more space for people to come. But hopefully everyone in Amsterdam who is interested in Ioke (all three of you? =) will be there. The presentation will be at TTY’s offices in Amsterdam, from 19. It seems my presentation will be from 20:35.

Hope to see you there. This might be an historic event! =)



Ioke S released


Exactly one month after the first release of Ioke, I am very happy to announce that Ioke S has been released. It has been a team effort and I am immensely pleased with the result of it.

Ioke is a language that is designed to be as expressive as possible. It is a dynamic language targeted at the Java Virtual Machine. It’s been designed from scratch to be a highly flexible general purpose language. It is a prototype-based programming language that is inspired by Io, Smalltalk, Lisp and Ruby.

Homepage: http://ioke.org
Download: http://ioke.org/download.html
Programming guide: http://ioke.org/guide.html

Ioke S is the second release of Ioke. It includes a large amount of new features compared to Ioke 0. Among the most important are syntactic macros, full regular expression support, for comprehensions, aspects, cond and case, destructuring macros, and many more things.

Ioke S also includes a large amount of bug fixes, and several example programs.

Features:

  • Expressiveness first
  • Strong, dynamic typing
  • Prototype based object orientation
  • Homoiconic language
  • Simple syntax
  • Powerful macro facilities
  • Condition system
  • Aspects
  • Developed using TDD
  • Documentation system that combines documentation with specs
  • Wedded to the JVM

The many things added in Ioke S could not have been done without the support of several new contributors. I would like to call out and thank:
T W <twellman@gmail.com>
Sam Aaron <samaaron@gmail.com>
Carlos Villela <cv@lixo.org>
Brian Guthrie <btguthrie@gmail.com>
Martin Elwin <elvvin@gmail.com>
Felipe Rodrigues de Almeida <felipero@gmail.com>



Guide to Ioke development environment


Wonderful! Martin Elwin posted the first part of a series on setting up an Ioke development environment. It’s fantastic to see people blog about Ioke, and Martin wrote some other pieces too. Especially the JSON parser is cool. You can see the development environment post here.



An Ioke spelling corrector


A while back Peter Norvig (of AI, Lisp and Google fame) published a small entry on how spelling correctors work. He included some code in Python to illustrate the concept, and this code have ended up being a very often used example of a programming language.

It would be ridiculous to suggest that I generally like to follow tradition, but in this case I think I will. This post will take a look at the version implemented in Ioke, and use that to highlight some of the interesting aspects of Ioke. The code itself is quite simple, and doesn’t use any of the object oriented features of Ioke. Neither does it use macros, although some of the features used is based on macros, so I will get to explain a bit of that.

For those that haven’t seen my Ioke posts before, you can find out more at http://ioke.org. The features used in this example is not yet released, so to follow along you’ll have to download and build yourself. Ioke S should be out in about 1-2 weeks though, and at that point this blog post will describe released software.

First, for reference, here is Norvig’s original corrector. You can also find his corpora there: http://norvig.com/spell-correct.html. Go read it! It’s a very good article.

This code is available in the Ioke repository in examples/spelling/spelling.ik. I’m adding a few line breaks here to make it fit the blog width - expect for that everything is the same.

Lets begin with the method “words”:

words = method(text,
  #/[a-z]+/ allMatches(text lower))

This method takes a text argument, then calls the method “lower” on it. This method will return a new text that is the original text converted to lower case. A regular expression that matches one or more alphabetical characters are used, and the method allMatches called on it. This method will return a list of texts of all the places matches in the text.

The next method is called “train”:

train = method(features,
  features fold({} withDefault(1), model, f,
    model[f] ++
    model))

The argument “features” should be a list of texts. It then calls fold on this list (you might know fold as reduce or inject. those would have been fine too.) The first argument to fold is the start value. This should be a dict, with a default value of 1. The second argument is the name that will be used to refer to the sum, and the third argument is the name to use for the feature. Finally, the last argument is the actual code to execute. This code just uses the feature (which is a text), indexes into the dict and increments the number there. It then returns the dict, since that will be the model argument the next iteration.

The next piece of code uses the former methods:

NWORDS = train(words(FileSystem readFully("small.txt")))

alphabet = "abcdefghijklmnopqrstuvwxyz" chars

As you can see we define a variable called NWORDS, that contains the result of first reading a text file, then extracting all the words from that text, and finally using that to train on. The next assignment gets a list of all the characters (as text) by calling “chars” on a text. I could have just written ["a", "b", "c", ...] etc, but I’m a bit lazy.

OK, now we come to the meat of the code. For an explanation of why this code does what it does, refer to Norvig’s article:

edits1 = method(word,
  s = for(i <- 0..(word length + 1),
    [word[0...i], word[i..-1]])

  set(
    *for(ab <- s,
      ab[0] + ab[1][1..-1]), ;deletes
    *for(ab <- s[0..-2],
      ab[0] + ab[1][1..1] + ab[1][0..0] + ab[1][2..-1]), ;transposes
    *for(ab <- s, c <- alphabet,
      ab[0] + c + ab[1][1..-1]), ;replaces
    *for(ab <- s, c <- alphabet,
      ab[0] + c + ab[1]))) ;inserts

The mechanics of it is this. We create a method assigned to the name edits1. This method takes one argument called “word”. We then create a local variable called “s”. This contains the result of executing a for comprehension. There are several things going on here. The first part of the comprehension gives a generator (that’s the part with the <-). The thing on the right is what to iterate over, and the thing on the left is the name to give it on each iteration. Basically, this comprehensions goes from 0 to the length of the word plus 1. (The dots denote an inclusive Range). The second argument to “for” is what to actually return. In this case we create a new array with two elements. The three dots creates an exclusive Range. Ending a Range in -1 means that it will extract the text to the end of it.

The rest of the code in this method is four different comprehensions. The result of these comprehensions are splatted, or spread out as arguments to the “set” method. The * is symbol to splat things. Basically, it means that instead of four lists, set will get all the elements of all the lists as separate arguments. Finally, set will create a set from these arguments and return that.

Whew. That was a mouthful. The next method is easy in comparison. More of the same, really:

knownEdits2 = method(word,
  for:set(e1 <- edits1(word),
    e2 <- edits1(e1),
    NWORDS key?(e2),
    e2))

Here we use another variation of a comprehension, namely a set comprehension. A regular comprehension returns a list. A set comprehension returns a set instead. This comprehension will only return words that are available as keys in NWORDS.

known = method(words,
  for:set(w <- words,
    NWORDS key?(w), w))

This method uses a set comprehension to find all words in “words” that are keys in NWORDS. As this point you might wonder what a comprehension actually is. And it’s quite easy. Basically, a comprehension is a piece of nice syntax around a combination of calls to “filter”, “map” and “flatMap”. In the case of a set comprehension, the calls go to “filter”, “map:set” and “flatMap:set” instead. The whole implementation of comprehensions are available in Ioke, in the file called src/builtin/F10_comprehensions.ik. Beware though, it uses some fairly advanced macro magic.

OK, back to spelling. Let’s look at the last method:

correct = method(word,
  candidates = known([word]) ifEmpty(
    known(edits1(word)) ifEmpty(
      knownEdits2(word) ifEmpty(
        [word])))
  candidates max(x, NWORDS[x]))

The correct method takes a word to correct and returns the best possible match. It first tries to see if the word is already known. If the result of that is empty, it tries to see if any edits of the word is known, and if that doesn’t work, if any edits of edits are known. Finally it just returns the original word. If more than one candidate spelling is known, the max method is used to determine which one was featured the most on the corpus.

The ifEmpty macro is a bit interesting. What it does is pretty simple, and you could have written it yourself. But it just happens to be part of the core library. The implementation looks like this:

List ifEmpty = dmacro(
  "if this list is empty, returns the result of evaluating the argument, otherwise returns the list",

  [then]
  if(empty?,
    call argAt(0),
    self))

This is a dmacro, which basically just means that the handling of arguments are taken care of. The argument for a destructured macro can be seen in the square brackets. Using a dmacro instead of just a raw macro means that we will get a good error message if the wrong number of arguments are provided. The implementation checks if its list is empty. If it is it returns the value of the first argument, otherwise it doesn’t evaluate anything and returns itself.

So, you have now seen a 16 line spelling corrector in Ioke (it’s 16 without the extra line breaks I added for the blog).