During the last couple of months, I have been going back and forth on how to deal with slicing while encoding. When it comes to clearing your mind, nothing helps like explaining it to others (well, perhaps there are some things that help even better), so here goes nothing:
Slicing
First question: what is slicing anyway? In Preon, slicing is basically creating a partial view on a BitBuffer. If you happen to have a BitBuffer with 512 bits, and you slice it, what you get is a new BitBuffer. If your current position is on bit 9, and you create a slice of 16 bits (hardly realistic, but it's just easier to work with smaller numbers), then you get a BitBuffer that contains 16 bits. On position 0, it will have the value of position 9 in the original BitBuffer, and so on.
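To make that a bit more tangible, here is a minimal sketch of the idea. SimpleBitBuffer and its methods (seek, readBit, slice) are hypothetical stand-ins that I made up for this post; they are not Preon's actual BitBuffer API. I'm also assuming that taking a slice moves the original buffer's position past the slice.

```java
import java.util.BitSet;

/** Hypothetical stand-in for a bit buffer; NOT Preon's actual BitBuffer API. */
final class SimpleBitBuffer {
    private final BitSet bits;  // backing storage, shared between a buffer and its slices
    private final int offset;   // where this view starts in the backing storage
    private final int size;     // number of bits visible through this view
    private int position;       // read position, relative to this view

    SimpleBitBuffer(BitSet bits, int size) {
        this(bits, 0, size);
    }

    private SimpleBitBuffer(BitSet bits, int offset, int size) {
        this.bits = bits;
        this.offset = offset;
        this.size = size;
    }

    int size() {
        return size;
    }

    void seek(int newPosition) {
        if (newPosition < 0 || newPosition > size) {
            throw new IndexOutOfBoundsException("position " + newPosition);
        }
        position = newPosition;
    }

    boolean readBit() {
        if (position >= size) {
            throw new IndexOutOfBoundsException("read past end of buffer");
        }
        return bits.get(offset + position++);
    }

    /** Returns a view of the next 'length' bits and (by assumption) skips past them. */
    SimpleBitBuffer slice(int length) {
        if (length < 0 || position + length > size) {
            throw new IndexOutOfBoundsException("slice exceeds buffer");
        }
        SimpleBitBuffer view = new SimpleBitBuffer(bits, offset + position, length);
        position += length;
        return view;
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet(512);
        bits.set(9);
        SimpleBitBuffer buffer = new SimpleBitBuffer(bits, 512);
        buffer.seek(9);
        SimpleBitBuffer slice = buffer.slice(16);
        // Position 0 of the slice is position 9 of the original.
        System.out.println(slice.size() + " " + slice.readBit()); // prints "16 true"
    }
}
```

The slice shares the backing bits with the original, so nothing gets copied; it only carries its own offset, size and position.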
Use of Slices
Slices are useful in situations in which 1) you have a collection of things and 2) you can't predict how many items it has, but you can predict how much space it occupies in total. If you first slice the BitBuffer, you can just keep reading until you reach the end of the slice. In reality, the underlying BitBuffer can contain way more data, but you can never read beyond the end of the slice.
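As a rough illustration of that "keep reading until the slice runs out" pattern, here is a byte-granular analogy built on java.nio.ByteBuffer rather than on Preon's BitBuffer. The class SliceReadingSketch and the helpers takeSlice and readAllShorts are names I made up for this sketch.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Byte-granular analogy; the names are hypothetical and unrelated to Preon. */
final class SliceReadingSketch {

    /** Carves sliceLength bytes off the front of source and moves source past them. */
    static ByteBuffer takeSlice(ByteBuffer source, int sliceLength) {
        ByteBuffer view = source.duplicate();
        view.limit(source.position() + sliceLength);
        ByteBuffer slice = view.slice();
        source.position(source.position() + sliceLength);
        return slice;
    }

    /** Reads 16-bit items until the slice is exhausted; the count is not known up front. */
    static List<Short> readAllShorts(ByteBuffer slice) {
        List<Short> items = new ArrayList<>();
        while (slice.remaining() >= Short.BYTES) {
            items.add(slice.getShort());
        }
        return items;
    }

    public static void main(String[] args) {
        // Six bytes of items, followed by two bytes that belong to something else.
        ByteBuffer data = ByteBuffer.wrap(new byte[] {0, 1, 0, 2, 0, 3, 0x7F, 0x7F});
        ByteBuffer slice = takeSlice(data, 6);
        System.out.println(readAllShorts(slice)); // prints [1, 2, 3]
        System.out.println(data.remaining());     // prints 2
    }
}
```

The loop never needs to know the number of items; the slice boundary is what stops it, and the trailing bytes in the original buffer stay untouched.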
The Size of the Slice
There are essentially two different ways to define the size of the slice: the size of the slice could either be a static value, or it can be defined in terms of data read upstream. As an example of the second case: there might be a situation in which the size of the slice is defined by a value encoded as an int right in front of the start of the actual slice.
Ways to Define a Slice
Out of the box, Preon provides two different ways to define a slice. There's a '@Slice' annotation, and a '@LengthPrefix' annotation. The second annotation existed before the first one, and is not as powerful as the first one. So while I was thinking about all of this, I did consider dropping @LengthPrefix entirely.
So how does it work?
The @Slice annotation is defined like this: @Slice(size="..."). In this case the size is a Limbo expression, so it can either be a static value or an expression that incorporates data read upstream.
The @LengthPrefix annotation is defined like this: @LengthPrefix(size="...", endian={ByteOrder}). In this case, the size is not the size of the actual slice, but the size of a value represented as an integer that holds the size of the slice. (See the example below.)
So, you could say that with @Slice you can do everything that can be done with @LengthPrefix and a little more, but not the other way around. So far, there doesn't seem to be a reason to keep @LengthPrefix any longer; it seems we can just deprecate it. But there is a little more to that. Read on.
Encoding
Encoding a slice is in a way more complicated than decoding it. Suppose that the slice is defined to have a fixed size. In that case, the size of the data to be encoded within that slice should not be bigger than the size of the slice itself.
If the size of the slice is a function of data read upstream, it becomes even more complicated. If the data within the slice requires a certain number of bits, then that function needs to result in that number of bits. Consequently, the variables used in that function need to have proper values, but since those variables are read from bits upstream, those bits need to have the proper value as well. So, the space required by downstream data requires upstream bits to have a certain value.
It's important to note that this is a specific case of a challenge that has been reported before, in a previous post. The @LengthPrefix annotation shines a different light on it though.
The following two snippets are essentially the same:
// Example 1:
@BoundNumber(size="8") int size;
@Slice(size="size") @BoundList byte[] data;
// Example 2:
@LengthPrefix(size="8") @BoundList byte[] data;
The Codecs generated by Preon will be almost identical. The main difference is that, in the first case, the size attribute is something that is accessible by the client. In the second case, it is data that will be read from the BitBuffer, but it will never be exposed to the Preon user.
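From the client's point of view, the difference looks roughly like the sketch below. This is hand-written decoding on top of java.nio.ByteBuffer, not what Preon generates; the class and method names are hypothetical, and to keep it short I'm assuming the 8-bit prefix counts bytes rather than bits.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hand-written illustration; names are hypothetical and this is not Preon-generated code. */
final class PrefixDecodingSketch {

    /** Mirrors Example 1: the size is a field the client can see (and change). */
    static final class WithExposedSize {
        final int size;
        final byte[] data;

        WithExposedSize(int size, byte[] data) {
            this.size = size;
            this.data = data;
        }
    }

    /** Mirrors Example 2: the prefix is consumed while decoding but never exposed. */
    static final class WithHiddenPrefix {
        final byte[] data;

        WithHiddenPrefix(byte[] data) {
            this.data = data;
        }
    }

    static WithExposedSize decodeExposed(ByteBuffer in) {
        int size = in.get() & 0xFF;      // unsigned 8-bit size (assumed to count bytes)
        byte[] data = new byte[size];
        in.get(data);
        return new WithExposedSize(size, data);
    }

    static WithHiddenPrefix decodeHidden(ByteBuffer in) {
        int size = in.get() & 0xFF;      // same bits on the wire, but thrown away afterwards
        byte[] data = new byte[size];
        in.get(data);
        return new WithHiddenPrefix(data);
    }

    public static void main(String[] args) {
        byte[] wire = {3, 10, 20, 30, 99};
        WithExposedSize a = decodeExposed(ByteBuffer.wrap(wire));
        WithHiddenPrefix b = decodeHidden(ByteBuffer.wrap(wire));
        System.out.println(a.size + " " + Arrays.toString(a.data)); // prints 3 [10, 20, 30]
        System.out.println(Arrays.toString(b.data));                // prints [10, 20, 30]
    }
}
```

Both variants consume exactly the same bytes; the only difference is whether the size survives as part of the decoded object.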
Because of the information hiding in case of @LengthPrefix, it will be easier to preserve the dependency between the size attribute and the actual size of the slice. Encoding the slice in case of @LengthPrefix could work like this:
- Determine if the size of the data structure inside the slice can be determined before it is encoded. If it can, simply write the size of the slice to the BitChannel first, and then continue writing the data structure that needs to be encoded.
- Otherwise, write the entire data structure to a temporary BitChannel backed by an in memory buffer. Then, before flushing the contents of this buffer into the target BitChannel, determine the required size of the buffer, and write that value to the target channel first. Then write the contents of the temporary buffer into the BitChannel.
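The two steps above could be sketched as follows. This is not Preon code: I'm using plain java.io streams in place of a BitChannel, the BodyWriter interface and both encode methods are hypothetical, and the prefix is assumed to be a single byte holding a byte count.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

/** Sketch with java.io streams standing in for a BitChannel; all names are hypothetical. */
final class LengthPrefixEncodingSketch {

    /** Hypothetical callback that writes the contents of the slice. */
    interface BodyWriter {
        void writeTo(OutputStream out) throws IOException;
    }

    /** Case 1: the size is known up front, so write the prefix and then the body directly. */
    static void encodeKnownSize(byte[] body, OutputStream target) throws IOException {
        writePrefix(body.length, target);
        target.write(body);
    }

    /** Case 2: the size is only known after encoding, so go through a temporary buffer. */
    static void encodeUnknownSize(BodyWriter body, OutputStream target) throws IOException {
        ByteArrayOutputStream temp = new ByteArrayOutputStream();
        body.writeTo(temp);               // encode into memory first
        writePrefix(temp.size(), target); // now the required size is known
        temp.writeTo(target);             // then flush the buffered body
    }

    private static void writePrefix(int size, OutputStream target) throws IOException {
        if (size < 0 || size > 0xFF) {
            throw new IllegalArgumentException("size " + size + " does not fit in an 8-bit prefix");
        }
        target.write(size);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        encodeUnknownSize(out -> { out.write(10); out.write(20); }, target);
        System.out.println(Arrays.toString(target.toByteArray())); // prints [2, 10, 20]
    }
}
```

Because the prefix is derived from the body at the moment of writing, the two can never disagree, which is exactly the property that hiding the size makes possible.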
So, this will guarantee that the encoded representation will always be correct, something that is currently impossible when using @Slice. For @Slice, it is possible to write to a temporary buffer and determine the required size, but then there is not much we can do with that information, since the size is data that is exposed to the client and could have been changed. (And having the ability to preserve dependencies between fields that can be changed is currently deliberately left out of the Preon roadmap.)
Summary
To summarize my thoughts: although there is some benefit in using @LengthPrefix when encoding, I'm still considering taking it out. @LengthPrefix is only going to be a benefit in a few cases, whereas the problem that manifests itself when using @Slice exposes a challenge that is 1) way harder to solve, but also 2) much more rewarding to solve.
If I take that route, then there will not be a reason for having the ability to write the slice to a temporary buffer to determine its size. It's possible to write to a BitChannel directly, and then throw an exception if the number of bits required exceeds the size defined by the slice.
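A minimal sketch of that "write directly, fail on overflow" approach could look like this. Again, this is not Preon code: BoundedOutputStream is a hypothetical wrapper around a plain java.io.OutputStream, and it counts bytes instead of bits.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical wrapper: forwards writes to the target but refuses more than 'capacity' bytes. */
final class BoundedOutputStream extends OutputStream {
    private final OutputStream target;
    private final long capacity; // the size defined by the slice (in bytes, for this sketch)
    private long written;

    BoundedOutputStream(OutputStream target, long capacity) {
        this.target = target;
        this.capacity = capacity;
    }

    @Override
    public void write(int b) throws IOException {
        if (written >= capacity) {
            throw new IOException("slice overflow: capacity is " + capacity + " bytes");
        }
        target.write(b);
        written++;
    }

    long remaining() {
        return capacity - written;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        BoundedOutputStream slice = new BoundedOutputStream(target, 2);
        slice.write(1);
        slice.write(2);
        try {
            slice.write(3); // one byte too many for this slice
        } catch (IOException e) {
            System.out.println(e.getMessage()); // prints "slice overflow: capacity is 2 bytes"
        }
    }
}
```

Note the trade-off: since nothing is buffered, the bytes that did fit have already reached the target by the time the exception is thrown.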
