Data types

Why use Kaya? – Meaningful data types

Kaya’s powerful type system makes it easy to meaningfully represent complex data structures in your program, allowing many bugs to be found at compile time. Type inference means that the need to explicitly write types down is minimised.

Static typing with type inference

Kaya is statically typed, which makes it much easier to ensure data integrity and to catch potential mistakes at compile time rather than at run time. Unlike most statically-typed languages, however, Kaya has type inference (and automatic type conversion where safe), which keeps the number of explicit type annotations to a minimum. In general, you should not need to specify types explicitly outside function definitions.

String printSquares(Int num) {
    output = ""; 
    for i in [1..num] { 
        // types of 'i' and 'output' are obvious, so no need to give them.
        square = i*i; // 'square' is another Int
        output += i+"*"+i+"="+square+"\n";
        // because the type of the whole expression is String, and Int->String
        // conversions are 'safe', there's no need to explicitly say
        // String(i)+"*"+String(i)+etc...
    }
    return output;
}

Algebraic data types

Kaya supports algebraic data types (ADTs), commonly found in functional languages such as Haskell or OCaml, but rare in imperative languages. These allow easy construction of meaningful data types to represent arbitrarily complex data.

‘Maybe’ in Kaya

A simple and common ADT is the ‘Maybe’ type, used to represent expected failure in a function.

Maybe<Int> strpos(String needle, String haystack) {
    if (length(needle) > length(haystack)) {
        return nothing;
    }
    for i in [0..length(haystack)-length(needle)] {
        if (substr(haystack,i,length(needle)) == needle) {
            return just(i);
        }
    }
    return nothing;
}

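In this example, the function's return type is always ‘Maybe<Int>’, but the value returned when the needle is not found (‘nothing’) is built with a different constructor from the value returned when it is found (‘just(i)’), and neither can be used directly as an Int. This allows code calling this function to be written in a way that can be at least partly checked at compile time.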

For comparison, the C libcgi strpos() function always returns ‘int’, using -1 to signal that the needle was not found in the haystack. The PHP strpos() function is even more confusing: it returns ‘false’ if the needle is not found, and because 0 and ‘false’ compare as equal under PHP’s loose ‘==’ comparison, it is easy to confuse ‘found at position 0’ with ‘not found’.

While both C and PHP have a way to distinguish the two cases, errors in implementation when a programmer uses the strpos() function will only be found at run-time. In Kaya, the use of the ‘Maybe’ type lets some obvious errors in implementation be caught at compile-time.
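
As a minimal sketch of what calling code might look like, the result of the strpos() function above can be unpacked with a ‘case’ statement. The function name ‘describePosition’ is hypothetical, and the sketch assumes that ‘nothing’ and ‘just’ can be matched with the same ‘case’ syntax used for other ADTs:

```kaya
// Hypothetical helper, not part of the standard library.
String describePosition(String needle, String haystack) {
    description = "";
    case strpos(needle, haystack) of {
        // no position exists in this branch, so none can be misused
        nothing -> description = "not found";
        // 'pos' is an ordinary Int, only available when a match was found
        | just(pos) -> description = "found at position "+pos;
    }
    return description;
}
```

Because the position is only bound inside the ‘just’ branch, a match at position 0 cannot be mistaken for ‘not found’.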

Representing complex data with ADTs

It is possible to define complex data types using ADTs, and so accurately and safely handle information. For example, the following code is used (with minor modifications) in the Kaya standard library to describe XML-like trees:


public data ElementTree([Element] elements, 
                        String name,
                        TinyDict<String,String> attributes);
public data Element = SubElement(ElementTree nested) |
                      CData(String cdata);

The content of an XML element is made up of a sequence of other elements and of ‘anonymous elements’ containing text. The Element data type allows these two possibilities to be represented meaningfully. It is then possible, for example, to traverse the tree to extract all element text:

String extractText(ElementTree el) {
    extracted = "";
    for sub in el.elements {
        case sub of {
            SubElement(nested) -> extracted += extractText(nested);
            | CData(cdata) -> extracted += cdata;
        }
    }
    return extracted;
}

By storing data in meaningful types, the code required to use the data can be kept short and easy to debug.
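
To illustrate, a second traversal can reuse exactly the same shape as extractText() above. The following is only a sketch: the function name ‘countTextNodes’ is hypothetical, and it assumes that ‘+=’ works on Int values in the same way as it does on Strings.

```kaya
// Hypothetical example: counts the CData (text) nodes anywhere in the tree.
Int countTextNodes(ElementTree el) {
    count = 0; // inferred to be an Int
    for sub in el.elements {
        case sub of {
            // recurse into nested elements and add their totals
            SubElement(nested) -> count += countTextNodes(nested);
            // each text node counts once
            | CData(cdata) -> count += 1;
        }
    }
    return count;
}
```

Only the two branch bodies differ from extractText(); the data type itself dictates the rest of the structure.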
