Macro types in Ioke – or: what is a dmacro?


With the release of Ioke 0, things regarding types of code were pretty simple. At that point Ioke had DefaultMethod, LexicalBlock and DefaultMacro. (That’s not counting the raw message chains of course). But since then I’ve seen fit to add several new types of macros to Ioke. All of these have their reason for existing, and I thought I would try to explain those reasons a bit here.

But first I need to explain what DefaultMacro is. Generally speaking, when you send the message “macro” in Ioke, you will get back an instance of DefaultMacro. A DefaultMacro is executed at runtime, just like regular methods, and in the same namespace. So a macro has a receiver, just as a method. In fact, the main difference between macros and methods are that you can’t define arguments for a macro. And when a message activates a macro, the arguments sent to that message will not be evaluated. Instead, the macro gets access to a cell called “call”. This cell is a mimic of the kind Call.

What can you do with a Call then? Well, you can get access to the unevaluated arguments. The easiest way to do this is by doing “call arguments”. That returns a list of messages. A Call also contains the message sent to activate it. This can be accessed with “call message”. Call contains a reference to the ground in which the message was sent. This is accessed with “call ground”, and is necessary to be able to evaluate arguments correctly. Finally, there are some convenience methods that allow the macro to evaluate arguments. Doing “call argAt(2)” will evaluate the third argument and return it. This is a short form for the equivalent “call arguments[2] evaluateOn(call ground, call ground)”.

This is all well and good. Macros allow you to do most things you would want to do, really. But they are quite rough to work with in their raw form. There are also plumbing that is a bit inconvenient. One common thing that you might want to do is to transform the argument messages without evaluating them, return those messages and have them be inserted instead of the current macro. You can do this directly, but it is as mentioned above a bit inconvenient. So I added DefaultSyntax. You define a DefaultSyntax with a message called “syntax”. The first time a syntax is activated, it will run, take the result of itself and replace itself with that result, and then execute that result. The next time that piece of code is found, the syntax will not execute, instead the result of the first invocation will be there. This is the feature that lies behind for comprehensions. To make this a bit more concrete, lets create a very simplified version of it. This version is fixed to take three arguments, an argument name, an enumerable to iterate over, and an expression for how to map the output value. Basically, a different way of calling “map”. A case like this is good, because we have all the information necessary to transform it, instead of evaluating it directly.

An example use case could look like this:

myfor(x, 1..5, x*2) ; returns [2,4,6,8,10]

Here myfor will return the code to double the the elements in the range, and then execute that.

The syntax definition to make this possible looks like this:

myfor = syntax(
  "takes a name, an enumerable, and a transforming expression
and returns the result of transforming each entry in the
expression, with the current value of the enumerable
bound to the name given as the first argument",

  argName = call arguments[0]
  enumerable = call arguments[1]
  argCode = call arguments[2]
  ''(`enumerable map(`argName, `argCode))
)

As you can see, I’ve provided a documentation text. This is available at runtime.

Syntactic macros also have access to “call”, just like regular macros. Here we use it to assign three variables. These variables get the messages, not the result of those things. Finally, a metaquote is used. A metaquote takes its content and returns the message chain inside of it, except that anywhere a ` is encountered, the message at that point will be evaluated and spliced into the message chain at that point. The result will be to transform “myfor(x, 1..5, x*2)” into “1..5 map(x, x*2)”.

As might be visible, the handling of arguments is kinda impractical here. There are two problems with it, really. First, it’s really verbose. Second, it doesn’t check for too many or too few arguments. Doing these things would complicate the code, at the expense of readability. And regular macros have exactly the same problem. That’s why I implemented the d-family of destructuring macros. The current versions of this are dmacro, dsyntax, dlecro and dlecrox. They all work the same, except the generate macros, syntax, lecros or lecroxes, depending on which version used.

Let’s take the previous example and show how it would look like with dsyntax:

myfor = dsyntax(
  "takes a name, an enumerable, and a transforming expression
and returns the result of transforming each entry in the
expression, with the current value of the enumerable
bound to the name given as the first argument",

  [argName, enumerable, argCode]

  ''(`enumerable map(`argName, `argCode))
)

The only difference here is that we use dsyntax instead of syntax. The usage of “call arguments[n]” is gone, and is instead replaced with a list of names. Under the covers, dsyntax will make sure the right number of arguments are sent and an error message provided otherwise. After it has ensured the right number of arguments, it will also assign the names in the list to their corresponding argument. This process is highly flexible and you can choose to evaluate some messages and some not. You can also collect messages into a list of messages.

But the real nice thing with dsyntax is that it allows several choices of argument lists. Say we wanted to provide the option of giving either 3 or 4 arguments, where the expansion looks the same for 3 arguments, but if 4 arguments are provided, the third one will be interpreted as a condition. In other words, to be able to do this:

myfor(x, 1..5, x*2) ; returns [2,4,6,8,10]
myfor(x, 1..5, x<4, x*2) ; returns [2,4,6]

Here a condition is used in the comprehension to filter out some elements. Just as with the original, this code transforms into an obvious application of “filter” followed by “map”. The updated version of the syntax looks like this:

myfor = dsyntax(
  "takes a name, an enumerable, and a transforming expression
and returns the result of transforming each entry in the
expression, with the current value of the enumerable
bound to the name given as the first argument",

  [argName, enumerable, argCode]

  ''(`enumerable map(`argName, `argCode)),

  [argName, enumerable, condition, argCode]

  ''(`enumerable filter(`argName, `condition) map(`argName, `argCode))
)

The only thing added is a new destructuring pattern that matches the new case and in that situation returns code that includes a call to filter.

The destructuring macros have more features than these, but this is the skinny on why they are useful. In fact, I’ve used a combination of syntax and dmacro to remove a lot of repetition from the Enumerable core methods, for example. Things like this make it possible to provide abstractions where you only need to specify what’s necessary, and nothing more.

And remember, the destructuring I’ve shown with dsyntax can be done exactly the same for macros and lecros. Regular methods doesn’t need it that much, since the rules for DefaultMethod arguments are so flexible anyway. But for macros this has really made a large difference.


7 Comments, Comment or Ping

  1. Another really eye-opening article. Thanks :-)

    It inspired me to play further with macros. After a little while I noticed strange behaviour that my model of what’s going on couldn’t explain. Perhaps you could shine some light on what’s happening in the following (I would have expected both calls a and b to produce the same output)

    λ ioke
    iik> m = macro(call)
    +> m:macro(call)

    iik> m(1,2,3+1) message code ;; call a
    +> “m(1, 2, 3 +(1)) message code .

    iik> m_message = m(1,2,3+1) message
    +> Message_0x3E7D948B:

    iik> m_message code ;; call b
    +> “m(1, 2, 3 +(1)) message”

    January 8th, 2009

  2. Sam:

    Actually, this does make perfect sense, really. The m-macro captures and returns the call information when it is activated. That means that the information returned from the m-macro will be the call site information at the place where the macro is activated.

    And what call contains is the message that originated the activation of the macro at that point. Which is why you can see the code for that in the first call to m. In the second case the activation saves away the resulting message, but that message still means the same thing, hence the last result.

    Clear? =)

    January 9th, 2009

  3. Wow this looks really great.

    January 9th, 2009

  4. Jason

    Interesting. These macros seem quite similar to the FEXPRs of older Lisp dialects.

    January 9th, 2009

Reply to “Macro types in Ioke – or: what is a dmacro?”