Wednesday, January 7, 2009

MultiReference Duck Typing?

I have been struggling with one of Preon's challenges for a while, and I haven't made up my mind about it yet. So I figured I would write it down here. Perhaps that helps.

So, here we go. First of all, if you don't know anything at all about Preon, I suggest you read the introduction available at http://preon.flotsam.nl/. One of the sections in that document briefly discusses references in Preon. And those references are currently posing some challenges.

Let's just say this is your object model.


class A {
    @Bound D valueOfD;
    @If("valueOfD.value > 0")
    @Bound int number;
}

class B extends D {
    @Bound int value;
}

class C extends D {
    @Bound String value;
}


Just to help you decode the example above: if you decode an instance of A, the first field encountered is valueOfD. I left out the annotations detailing how to decide between B and C, for the sake of simplicity. However, keep in mind that - since B and C are subtypes of D - valueOfD could be an instance of either B or C.

The next bound field in A is number. But that field will only be read if the value property of valueOfD is greater than zero. But wait a minute! The object referenced by valueOfD could be either an instance of B or an instance of C. That means value is either an integer or a String, which in turn means that the expression 'valueOfD.value > 0' could be either valid or invalid.

Now, the question is: what do you expect in cases like these? There are (at least) two options:

  • You expect the framework to generate an exception while creating the Codec. Since value can be either an int or a String, the expression might be invalid at runtime, and raising an exception makes the user aware of this potential problem.
  • You expect the framework to be able to deal with it. If there is an expression like valueOfD.value > 0, then the framework should basically assume that this is what the user means. So if this expression is only valid for instances of B, then the framework is expected to assume that this will be an instance of B, and not an instance of C. We only expect the framework to generate an exception if - at runtime - the instance of D referenced by valueOfD is not an instance of B.


I tend to lean towards the second option, but I wonder if I understand all of its consequences yet.

Now, I think the second option would be doable, and could be done fairly easily using the existing abstractions in Limbo, the language used in Preon. Whenever Preon encounters situations like the above, the Reference constructed from valueOfD.value is a so-called MultiReference. It basically is a Reference that captures the different options. We don't know exactly which of the options will hold at runtime, but we do know for sure it's one of them.

One of the things that was built in recently is the option to narrow references to references that resolve to a certain type. The operation on Reference takes a type, and returns a new Reference that is guaranteed to resolve to an instance of that type. And if it is impossible to return something like that, it will return null.

This is something that seems to fit the above case fairly well. If a reference is used in an expression that assumes the reference to resolve to an integer type, we could narrow that reference to a reference of type integer, and be done with it.

The narrow operation will - in most cases - return the same reference or null. Only for a MultiReference does it have a real effect. If a MultiReference points either to an int or to a String, then narrowing it to Integer will generate a new MultiReference with the String reference dropped.
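To make the narrowing behaviour a bit more concrete, here is a minimal sketch in plain Java. It is not Limbo's actual code: the Reference interface below is a simplified stand-in, and SimpleReference, describe() and NarrowDemo are names made up purely for illustration. The only things taken from the description above are the narrow operation, the null result when narrowing is impossible, and the MultiReference that drops the options that don't fit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for Limbo's Reference (an assumption).
interface Reference {
    // Returns a reference guaranteed to resolve to 'type', or null if impossible.
    Reference narrow(Class<?> type);

    // Illustration-only helper that renders the reference as text.
    String describe();
}

// A reference with exactly one known type (hypothetical class).
final class SimpleReference implements Reference {
    private final String path;
    private final Class<?> type;

    SimpleReference(String path, Class<?> type) {
        this.path = path;
        this.type = type;
    }

    @Override
    public Reference narrow(Class<?> requested) {
        // Either the same reference comes back, or nothing at all.
        return requested.isAssignableFrom(type) ? this : null;
    }

    @Override
    public String describe() {
        return path + ":" + type.getSimpleName();
    }
}

// A reference that resolves to one of several options at runtime.
final class MultiReference implements Reference {
    private final List<Reference> options;

    MultiReference(List<Reference> options) {
        this.options = new ArrayList<>(options);
    }

    @Override
    public Reference narrow(Class<?> requested) {
        List<Reference> kept = new ArrayList<>();
        for (Reference option : options) {
            Reference narrowed = option.narrow(requested);
            if (narrowed != null) {
                kept.add(narrowed); // this option survives the narrowing
            }
        }
        // No option fits: narrowing is impossible, so return null.
        return kept.isEmpty() ? null : new MultiReference(kept);
    }

    @Override
    public String describe() {
        StringBuilder text = new StringBuilder();
        for (Reference option : options) {
            if (text.length() > 0) {
                text.append(" | ");
            }
            text.append(option.describe());
        }
        return text.toString();
    }
}

final class NarrowDemo {
    public static void main(String[] args) {
        Reference value = new MultiReference(Arrays.asList(
                new SimpleReference("valueOfD.value", Integer.class),
                new SimpleReference("valueOfD.value", String.class)));
        // The String option is dropped; only the Integer option remains.
        System.out.println(value.narrow(Integer.class).describe());
    }
}
```

In this sketch, narrowing the two-option reference to Integer leaves a MultiReference with a single option, while narrowing it to a type that neither option matches gives null.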

So, given that almost all I need is there, what's the doubt? Hard to tell. Maybe gut feeling. Or maybe it has to do with the fact that it is considerably different from what you have in other existing languages. In a strongly typed language, your assumption about the receiving type would normally require you to add a cast. But adding a cast to an expression language that is expected to be easily readable doesn't seem quite right. Most languages don't allow you to leave this kind of uncertainty in.
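For comparison, this is roughly what the same condition looks like in plain Java, where the cast cannot be avoided. It is only a sketch: B, C and D mirror the object model above without the Preon annotations, D as an empty abstract class is an assumption (its definition isn't shown in this entry), and CastDemo and shouldReadNumber are made-up names.

```java
// Assumed shape of D; the entry does not show its definition.
abstract class D {
}

class B extends D {
    final int value;

    B(int value) {
        this.value = value;
    }
}

class C extends D {
    final String value;

    C(String value) {
        this.value = value;
    }
}

final class CastDemo {
    // Hypothetical helper: the Java counterpart of 'valueOfD.value > 0'.
    static boolean shouldReadNumber(D valueOfD) {
        // D declares no 'value', so the compiler insists on the cast.
        // If valueOfD turns out to be a C, this throws ClassCastException.
        return ((B) valueOfD).value > 0;
    }

    public static void main(String[] args) {
        System.out.println(shouldReadNumber(new B(5)));
    }
}
```

The runtime behaviour is close to what the second option asks of the framework: everything is fine for an instance of B, and an exception is raised only when the object turns out to be a C. The difference is that Java makes you spell out the assumption.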

... unless you consider dynamically typed languages. But in dynamically typed languages, everything is evaluated at runtime. And you don't even bother keeping track of the different types to which a reference could resolve.

Nevertheless, it would be sort of like adding duck typing capabilities to a largely statically typed language. Duck typing to take away the ambiguity, rather than duck typing to avoid having to declare types.

I think I'm still in favor of the second approach. If you have even the slightest idea what this entry was about, and you feel differently, feel free to comment.

(Oh yeah, and I forgot to mention that I ran into this issue while trying to apply Preon to a parser for Java bytecode. It's amazing what you run into when you try to dissect bytecode. I can quite easily imagine that BlackBerry and Android are capable of turning a class into a more efficient representation...)
