Type Systems for Beginners: An Introduction

Gilbert

What is a type system? Why are type systems useful? Knowing the basics of type systems will make you a more educated developer who is more aware of what exists outside the realm of your favorite language. In this post we will explore the fundamentals of types and type systems.

In particular, we will explore:

What is a type?
Why are types necessary?
What is a type system?
What is a static type system?

1. What is a Type?

In practice, a type can refer to one of two things:

A specific shape of data, or
A specific transformation of data.

Data Shapes & Concrete Types

When we work with data, we commonly ask ourselves: is it a string? a number? an array? All these questions are fundamentally asking what the "shape" of our data is. In other words, we are asking, what does our data look like?

Any experienced programmer will tell you it is essential to understand the shape of your data before using it. After all, if you don't know what you have in your hand, how are you going to know what to do with it?

Concrete types (formally known as "proper" types) are formal descriptions of the shape of data. Concrete types include types you're familiar with, such as String, Boolean, and Number. They also include more complex types, such as Array[String], which means "an array of strings", and { username: String, score: Number }, which means "an object with a property 'username' whose type is String, and a property 'score' whose type is Number".

The main thing about concrete types is they are, well, concrete. There's no ambiguity about what they are; when you see Array[Number], there is no question the type means "An array of numbers".
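As a rough sketch, here is how those shapes could be written down in TypeScript. TypeScript is assumed here purely as a convenient notation (the post itself is not tied to it), and the name Player is made up for illustration.

```typescript
// Illustrative only: TypeScript is assumed as a notation for data shapes.
// The type name Player is hypothetical.
type Player = { username: string; score: number };

const greeting: string = "Hello";            // String
const isReady: boolean = true;               // Boolean
const points: number = 42;                   // Number
const tags: string[] = ["new", "featured"];  // Array[String]
const player: Player = { username: "alice", score: 10 };

// At runtime, JavaScript can still report the basic shape of each value.
console.log(typeof greeting, typeof isReady, typeof points);
console.log(Array.isArray(tags), player.username, player.score);
```

Each annotation after the colon is a concrete type: it leaves no doubt about what the value looks like.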

Data Transformations & Function Types

A transformation of data is described by a function type. A function type consists of a list of input types and an output type. For example, take the following code:

function getLength (str) {
  return str.length;
}

function multiply (a, b) {
  return a * b;
}

In this example, getLength has the function type (String) => Number. That is, the function receives a String as its input, and transforms it into a Number as its output.

Similarly, multiply has the type (Number, Number) => Number. That is, the function receives two Numbers as its input, and transforms them into a single Number as its output.

As you may have guessed, a function type's input and output types are most commonly concrete types (String, Boolean, etc.).
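To make the two function types above explicit, here is a minimal sketch with TypeScript annotations (an assumed notation, not something the original code uses). The alias StringToNumber and the variable measure are hypothetical names added for illustration.

```typescript
// (String) => Number
function getLength(str: string): number {
  return str.length;
}

// (Number, Number) => Number
function multiply(a: number, b: number): number {
  return a * b;
}

// A function type can also be named and reused.
// StringToNumber is a hypothetical alias, not part of the original post.
type StringToNumber = (input: string) => number;
const measure: StringToNumber = getLength;

console.log(getLength("hello")); // 5
console.log(multiply(3, 4));     // 12
console.log(measure("type"));    // 4
```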

2. Why are Types Necessary?

Every programming language needs some awareness of types in order to avoid running into an unrecoverable crash. Take the following code:

var name = "Alice";
var result = name();

When we run this code, JavaScript will throw a TypeError when it reaches line 2; internally, it knows that calling a string just doesn't make sense.

Throwing an error is a good thing. If JavaScript did not throw an error, very strange things would happen in your program, causing it to crash at some ambiguous point later on, and the original cause would be very hard to track down.

Types allow a programming language to detect bugs in your program before things go horribly wrong. Exactly how and when this detection takes place depends on the type system.
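The runtime detection described above can be observed directly. The sketch below is written in TypeScript, which is an assumption of this example. The variable is deliberately typed as any so the static checker steps aside and the error surfaces at runtime, just as it would in plain JavaScript. The names playerName and message are hypothetical.

```typescript
// `any` switches off static checking for this variable, so the mistake
// is only discovered when the program runs.
const playerName: any = "Alice";

let message = "no error";
try {
  playerName(); // a string is not callable
} catch (err) {
  if (err instanceof TypeError) {
    message = "caught a TypeError";
  }
}

console.log(message); // caught a TypeError
```

The language stops at the exact point where the types stop making sense, instead of carrying on with a nonsensical value.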

3. What is a Type System?

A programming language's type system specifies what kinds of types it has available, but more importantly it specifies how those types are allowed to interact. If you try to run code that does not follow these type rules, the language reports a specific kind of error: a type error.

To demonstrate this, let's consider some code in two different languages. JavaScript and Python each have different type systems, and therefore have different rules for how strings and numbers can interact. Let's explore how.

In JavaScript, adding a string and a number is allowed; it converts the number to a string automatically:

> "You scored " + 10 + " points!"
"You scored 10 points!"

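In Python, this is not allowed; you must convert 10 to a string before adding it to another string. If you tried the above code in Python, you would get a type error (the output below is from Python 2; Python 3 raises the same TypeError with a differently worded message):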

>>> "You scored " + 10 + " points!"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot concatenate 'str' and 'int' objects

As you can see, a programming language's type system has a big impact on how you interact with that language. Can a number auto-convert to a string? Can an array contain items of more than one type? Can a function receive more than one type? What about returning different types depending on input? All these rules must be predefined by the type system.
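As one possible set of answers to those questions, here is a small sketch of how TypeScript (assumed here only as an example type system) handles them. The names implicitText, explicitText, mixed, and describeValue are made up for illustration.

```typescript
// Number-to-string conversion is allowed implicitly, as in JavaScript.
const score = 10;
const implicitText = "You scored " + score + " points!";

// The explicit conversion that Python insists on, written out by hand.
const explicitText = "You scored " + String(score) + " points!";

// An array may mix types only when its element type says so.
const mixed: (string | number)[] = ["alice", 10];

// A function may accept more than one type if its signature declares it.
function describeValue(value: string | number): string {
  return typeof value === "string" ? "text: " + value : "number: " + value;
}

console.log(implicitText === explicitText);        // true
console.log(mixed.map(describeValue).join(", "));  // text: alice, number: 10
```

A different type system could answer each question differently; the point is that the rules are fixed in advance.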

4. Static Type Systems

Every language has a type system. However, type systems are commonly categorized based on one important aspect, summarized by one important question:

Should the type system detect type errors before I run my code?

Detecting type errors is known as type checking. If a type system type checks before you run your code, it is considered a static type system. If the type checking happens at runtime (as in JavaScript), it is instead considered a dynamic type system.

So when should type checking happen? Intuitively, the answer is beforehand, of course! Static typing forces newly written code to be compatible with existing code. Not only does this make working with large code projects a lot easier, but it also saves a lot of time by catching silly mistakes before the code ever runs.

Example 1: Working with Player Data

Here is a small example of how static typing can help you be more productive. Let's say you have an array of players:

var players = [
  { nickname: 'alice', score: 10 },
  { nickname: 'bob', score: 8 },
];

...and you want to print each player's score to the console:

players.forEach(function(p) {
  console.log("Player", p.username, "has score", p.score);
});

If you run this code with JavaScript, in the console you would see this output:

Player undefined has score 10
Player undefined has score 8

Woah, what happened? A silly mistake: we accidentally wrote p.username instead of p.nickname. With a static type system, however, we would get a type error before the code runs at all. The error message would tell us that username is not a property of the p object.
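Here is a sketch of the same code with a static type attached, using TypeScript as an assumed example of a static type system. The type name Player and the variable lines are hypothetical. The mistaken property access is left as a comment so the sketch still runs.

```typescript
// Player is a hypothetical name for the shape of each array item.
type Player = { nickname: string; score: number };

const players: Player[] = [
  { nickname: "alice", score: 10 },
  { nickname: "bob", score: 8 },
];

// Writing p.username below would be rejected before the program runs,
// with a message along the lines of:
//   Property 'username' does not exist on type 'Player'.
const lines = players.map((p) => `Player ${p.nickname} has score ${p.score}`);

lines.forEach((line) => console.log(line));
// Player alice has score 10
// Player bob has score 8
```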

Example 2: Adding a Player

Let's extend the previous example to add a new player:

players.push({ name: 'Johnny', score: 100 });

Seems harmless, doesn't it? But wait – if we were using a static type system, we might see an error like this:

Cannot push { name: String, score: Number }
into Array[{ nickname: String, score: Number }]

Woah, what happened? We tried to add an object with a different shape than all the other objects in the array. Most likely, this is not what we wanted to do. Fortunately, most static type systems would prevent us from doing it :)

This may be a simple example, but imagine if the players.push code was in an entirely separate context, deeply nested within other functions in other files. At a certain point it's hard to keep all your data shapes in your head at the same time. Static typing helps keep all of this in check, and – as a side effect – makes changing and refactoring your code a much better experience.
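To picture the "separate context" case, here is a minimal sketch in TypeScript (an assumed example of a static type system). The helper addPlayer and the type name Player are hypothetical; imagine the helper living in a different file from the array.

```typescript
// Hypothetical names: Player and addPlayer are not from the original post.
type Player = { nickname: string; score: number };

const players: Player[] = [
  { nickname: "alice", score: 10 },
  { nickname: "bob", score: 8 },
];

// Imagine this function defined far away from the array above.
function addPlayer(list: Player[], player: Player): number {
  return list.push(player); // push returns the new length of the array
}

// addPlayer(players, { name: "Johnny", score: 100 });
// ^ rejected by the type checker: 'name' is not a property of Player.

const count = addPlayer(players, { nickname: "johnny", score: 100 });
console.log(count); // 3
```

Because the function signature carries the shape with it, nobody has to remember what the array is supposed to contain.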

Misconceptions About Static Typing

The merits of static vs. dynamic typing have been debated for quite a while. Although there are some valid arguments against static typing (which we will save for a later post), there are quite a few commonly believed arguments against it that, while perhaps true at one point, are no longer valid.

Myth: Static typing is verbose

In other words, "static type systems force me to write all my types". This is true in languages like Java – indeed, many veteran developers have been scarred by Java's verbose and unsound type system. However, modern static type systems have a much better feature called type inference.

Type inference is the act of the type system automatically determining the types of your code. For example, take the following function:

function increment (x) {
  return x + 1;
}

In Java, we would have to declare all our types for this function, i.e. public static int increment (int x). However, in a modern type system (such as Haskell's or Elm's), this function would automatically be inferred as having type (Number) => Number, mainly derived from the fact that we add 1 to x, therefore x must be a number.
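For a taste of inference in practice, here is a small sketch in TypeScript. This is an assumption of the example, and TypeScript's inference is more limited than Haskell's or Elm's: function parameters generally still need an annotation, but return types and local variables are worked out automatically. The names next and labels are hypothetical.

```typescript
// Only the parameter is annotated; the return type is inferred as number.
function increment(x: number) {
  return x + 1;
}

// No annotations here at all: the types follow from the expressions.
const next = increment(41);                               // inferred: number
const labels = [1, 2, 3].map((n) => `#${increment(n)}`);  // inferred: string[]

console.log(next);              // 42
console.log(labels.join(", ")); // #2, #3, #4
```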

Myth: Static typing replaces testing

Surprisingly, I've heard this argument from both sides. Either way, it's not true. Static typing does help you write compatible code, but not necessarily correct code; to remain a responsible developer, you still need to write tests.

With that in mind, static typing does remove the need to write an entire class of basic tests. For example:

function prependData (array) {
  return [10,20].concat(array);
}

For the above code, you don't have to write tests to see if it handles bad input, such as prependData(null) or prependData(123) – the type system prevents such code from running in the first place.
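A sketch of the same function with a declared input type, again assuming TypeScript as the example type system. The bad calls are shown as comments so the sketch still runs, and the variable combined is a hypothetical name.

```typescript
function prependData(array: number[]): number[] {
  return [10, 20].concat(array);
}

// prependData(123);  // rejected: a number is not an array of numbers
// prependData(null); // rejected too, assuming strict null checks are enabled

// The behaviour itself still deserves a test: types cannot tell us
// whether the values end up in the right order.
const combined = prependData([30, 40]);
console.log(combined.join(",")); // 10,20,30,40
```

The type checker rules out the bad inputs, while the last two lines check the part it cannot see: that the function actually does what we meant.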

Conclusion

There is much more to talk about, but we have covered enough for an introductory post. Let's review what we learned:

We learned about data shapes and concrete types, and how they relate to function types (transformations)
We explored how some kind of type system is necessary in every programming language
And lastly, we explored static type systems and how they can benefit us as programmers.

Static typing is something that takes time to get used to, but the benefits are grand. I hope this post helps pique your interest in the strength and safety of static type systems :)