Variable Length Arrays

1) Intro

There are many C libraries, interface description languages, file formats and
network specifications that define variable sized structures. In order to
interact with these, the LDL must be able to represent structures with variable sizes.
This document describes some additions that could be made to the
existing LDL and Layout Runtime to support variable-length arrays (VLA).

This document introduces only one kind of VLA, a fixed header repeating tail array.
This is a very basic approach but it highlights a few important issues with variable
sized layouts. Future documents will expand on this and introduce more complicated
types of VLAs with fewer restrictions.

2) Fixed header repeating tail

This type of VLA provides an access pattern for a single dimension of
uniform repeating types, where the number of elements is determined by a value in a
separate field (repeat count field) in the preceding fixed header.

Each array element is accessed with an array base address and an offset calculated
by multiplying the element size by its index. This means that in cases where the
element size is not padded to a hardware container granularity, container properties
such as atomicity can not be enforced.

Rules:

3) LDL changes

In order to support VLAs the LDL must be able to describe the field that determines
the length of the VLA. Below is an example with our existing layout prototype.

Example 1: Basic variable sized layout
"A",                //enclosing layout
"w:jint:4",       //<-- start of fixed header
"x:jint:4",        //repeat count field
"y:jint:4",        //<-- end of fixed header
"z:jbyte[x]:0"; //VLA with size determined by field 'x'

Note: we are not proposing this syntax, this is only for demonstration purposes.

4) Layout Interface changes

The following interfaces are introduced to support VLAs. These are in addition to the
interfaces included in the layouts prototype.

The VLArray type is used to represent a VLA inside a layout. It is not possible to
instantiate this type on its own, an instance can only be obtained by a getter method
from the enclosing layout.

public interface VLArray<T extends Layout> extends LayoutType {
    //no public factory methods
    //T is the type of the array element
    
    //return the value of repeat count field
    long getVLALength();
    
    T at(long index) throws arrayindexoutofboundsexception;
    
    void put(long index, T val) throws arrayindexoutofboundsexception;
    
    //sizeof VLA in bytes
    long sizeof();
}

The following type, "[primType]VLArray", will not be required for specification. It is an
inelegant part of the prototype because of java limitations which will be removed by
support for Generics over Primitive Types.

public interface [primType]VLArray extends LayoutType {
    //no public factory methods
    //[primType] is the type of the array element
    
    //return the value of repeat count field
    long getVLALength();
    
    [primType] at(long index) throws Arrayindexoutofboundsexception;
    
    void put(long index, [primType] val) throws Arrayindexoutofboundsexception;
    
    //sizeof VLA in bytes
    long sizeof();
}

A new "bindLocation(Location loc, long repeatCountInitializer)" is added to LayoutType.
This method allows a user to specify a value that will be used to initialize the repeat
count field when the layout is bound to a location. Also, "isVarSized()" is added to
indicate whether the layout is a variable size structure or not.

public interface LayoutType {
    
    //Returns true if contains VLA
    boolean isVarSized();
    
    //if layout is variable sized this operation
    //will read repeat count field value when binding to location
    public abstract void bindLocation(Location loc);
    
    //Initializes repeat count field of VLA with repeatCountInitializer value.
    //If the layout is not a variable sized layout this method throws an exception
    public abstract void bindLocation(Location loc, long repeatCountInitializer)
        throws UnsupportedOperationException;
}

5) Key Questions

5.1) Is the repeat count field writable?

Answer:
The repeat count field is read-only. The behaviour of the simple case should be
consistent with future extensions (ie. VLA followed by VLA). Allowing the repeat
count field to be writeable would make it possible to introduce gaps in the layout
or overflow memory allocated for the VLA.

---------------------------------------
Question to Spec Group:
Should we track capacity or not?
In Java there are many ways to alias data. Even if we make the repeat count field
read-only by only generating getters for it, it doesn't guarantee that the value will
not be modified by other means. Keeping track of capacity would help to protect against
cases where aliasing could cause modification of data that leads to overflows, but this
increases memory overhead.
---------------------------------------

5.2) How do we validate the size of a variable sized layout when binding to memory?

Answer:
In the case where the size of the VLA is not known (bindLocation(Location loc);) the
operation will first determine if the fixed header fits within the memory allocated for
the layout. If this is true, the repeat count field is read making it possible to do
bounds checks on the entire layout. In the other case
(bindLocation(Location loc, long repeatCountInitializer);), the size of the entire layout is
known at bind time so bounds checks are performed on the layout.

5.3) Is length a property of the data or of the interface?

Answer:
The design decision is that the length is a property of the data and not the interface.
This means we can not optimize the instance of the interface (bounds checks of the
accessor) based on the length. The reason for this decision is that we don't want to
require that the data be read before the interface is instantiated.

6) Summary

The "Fixed header repeating tail" VLA is very simple and covers a
limited set of use cases, however, these use cases are likely the most common.
This this type of VLA is not very useful for complex variable sized
structures like a Java Class File. This document serves as a starting point to begin
discussions on VLAs, future documents will introduce new types that address a
wider range of use cases.