Defining data structures
Data structures can be defined by combining or providing alternatives between other data structures or primitive types. The simplest data definition is the ‘record’ type, which is similar to a C struct.
data Person = newPerson(String name, Int age);
newPerson
is a constructor function, which is otherwise treated like a normal function, and creates an instance of the Person
type. Type names must begin with a capital letter to distinguish them from type variables (more on type variables later). The Person
type can be used as follows:
Void main() {
x = newPerson("Harvey the wonder hamster",6);
putStr( x.name + " is " + x.age + " years old");
x.name = "Fred"; // Assignment works too
birthday(p);
}
Void birthday(Person p) {
p.age += 1;
}
As shown in this example, the ‘.
‘ operator can be used to extract parts of a type for retrieval or setting. There is no need, in the birthday
function to use the var
keyword, because as in the basic tutorial, the function parameter is only a shallow copy.
The namespaces for types and functions are separate, so the constructor function may be named after the type:
data Person = Person(String name, Int age);
which may, since the two are the same, be shortened to
data Person(String name, Int age);
The remainder of this tutorial looks at polymorphic and algebraic types, where the real power of Kaya’s type system is.
Polymorphic types
Where a type definition is not strongly dependent on the component types, a type variable may be used. Type variables must begin with a lower-case letter, and represent any type (though always the same type in the same instance)
data Pair<a,b>(a fst, b snd);
data SamePair<a,a>(a fst, a snd);
The Pair data type pairs together two pieces of arbitrary data. The SamePair
type is similar, but has the added restriction that both pieces must have the same type. While the type definition contains a type variable, any particular instance of the type has a fixed type – the following code is not allowed.
arr = [];
arr[0] = Pair(3,2.2);
arr[1] = Pair("foo",false);
// compile-time error because the types don't match
Polymorphic types in functions
Functions can also have polymorphic types. For example, this function will reverse any array.
[a] reverse([a] xs) {
pos=size(xs)-1;
x=0;
while(pos>=0) {
newxs[x]=xs[pos];
x=x+1;
pos=pos-1;
}
return newxs;
}
The allMatch
function from the basic tutorial can be rewritten like this:
Bool allMatch([a] xs, Bool(a) test) {
for x in xs {
if (!test(x)) {
return false;
}
}
return true;
}
The example of code using this function to determine if all integers in an array are odd will still work exactly as before.
A polymorphic function must not use any of its polymorphic arguments in a more specific way:
// this is not allowed
public Void printAnything(a val) {
putStrLn(String(val));
}
Types with multiple constructors
It is possible for a type to have more than one constructor function (the practical upper limit is 65536 constructor functions). These are known as algebraic data types (ADTs), commonly found in functional languages such as Haskell or Ocaml, but rarely found in imperative languages. They often provide a natural way to describe data where an object can take one of many different forms (for example, a binary tree can be either empty or have two subtrees).
A simple linked list can be defined as:
data List<a> = nil
| cons(a head, List<a> tail);
Note that this particular type is defined recursively. The vertical bar divides the two constructors:
nil
, which takes no parameters and therefore can have the brackets omitted.cons(a head, List<a> tail)
, which takes two parameters – the head element, and the rest of the linked list.
This data type can then be initialised – in this case to a list of integers – with:
list = cons(5,cons(3,cons(10,nil)));
This makes a three-element linked list. Obviously a list of arbitrary length could be initialised this way, but the numbers of brackets involved might make a clearer method appropriate!
Having put the data into the list, retrieving it requires telling the difference between the two constructors. The case
statement is used for this. The following function (recursively) prints a list of String
s.
Void printList(List<String> list) {
case list of {
cons(head,tail) -> putStrLn(head);
printList(tail);
| nil -> return; // stop at this point.
}
}
The constructors are listed in the opposite order just to show it makes no difference. You can miss out constructors, but this is a run-time error unless you have a default
case. The following function uses a default case to return a default value if the list is empty.
a getHead(List<a> list, a defval) {
case list of {
cons(head,tail) -> return head;
| default -> return defval;
}
}
Since the List<a> type has a parameter called head
, it’s possible to writelist.head
to extract the head element from a List<a>
called list
. If the list was constructed using the nil
constructor function, it won’t have a head element, and a run-time error will be thrown. Therefore, using case
statements is much better for types with multiple constructors.
If you wish to enforce this on users of your type, you can make the fields unnamed:
data Tree<a> = leaf(a)
| node(Tree<a>,Tree<a>);
Function overloading
Because the printing method for a variable depends on its type, we sadly can’t make a single list printing function for all types of list. However, you can make multiple list printing functions with the same name:
Void printList(List<String> list) {
case list of {
cons(head,tail) -> putStrLn(head);
printList(tail);
| nil -> return; // stop at this point.
}
}
Void printList(List<Int> list) {
case list of {
cons(head,tail) -> putStrLn(String(head));
printList(tail);
| nil -> return; // stop at this point.
}
}
Void printList(List<Float> list) {
case list of {
cons(head,tail) -> putStrLn(String(head));
printList(tail);
| nil -> return; // stop at this point.
}
}
Provided the functions have different types, this is allowed. This is currently only a convenience feature – it’s not yet possible to call printList
on a List<a>
and have it “do the right thing” when the program is run.
An example from the standard prelude
The following two functions from the standard prelude both return the absolute value of a number.
Int abs(Int x) {
if (x < 0) {
return -x;
} else {
return x;
}
}
Float abs(Float x) {
if (x < 0.0) {
return -x;
} else {
return x;
}
}
A call to the abs function will automatically use the right function for the type of the input or output.
i = abs(-5); // calls Int(Int) version
j = abs(-4.5); // calls Float(Float) version
Int k = abs(3.7); // error - there is no abs() function of type Int(Float)!
Some examples of data types
‘Maybe’ (representing possible failure)
data Maybe<a> = nothing | just(a val);
This type is defined in the standard Prelude, and is used to represent the possibility of failure. Because the nothing
value can be stored, this often makes more sense than throwing an exception. For example, a set of preferences between options may be expressed as [Maybe<Option>]
.
data Option(Int optid, String optname);
Void printPrefs([Maybe<Option>] preferences) {
for pref@idx in preferences {
putStr("Preference "+(idx+1)+": ");
case pref of {
nothing -> putStrLn("unspecified");
|just(opt) -> putStrLn(opt.optname);
}
}
}
Binary Trees
data Tree<a> = Leaf(a val)
| Node(Tree<a> left, Tree<a> right);
Directory Trees
data Content = File(String fname)
| Dir(String fname, [Content] subtree);