Type Systems are Logic
The Curry-Howard-Lambek-Scott correspondence famously shows that type systems and logic are the same kind of thing. Another way of looking at type systems is that they are a form of abstract interpretation. There are countless other ways of thinking about types, so I thought it it would be useful to write down some of my own ideas on the topic in the language that I personally find useful for reasoning about code!
A Real-World Example
Recently I've been using Go's syscall/js
package. This is a
low-level library that allows Go code compiled to wasm
to
interact with Web APIs when run in the browser. One
major challenge of working with syscall/js
is that it does
provide type-safety, meaning it's very easy to get runtime
errors by passing an argument of the wrong type.
The following code will generate an invalid Value
error at
runtime:
type Value = js.Value
func alert1() {
var window Value
window = js.Global().Get("window")
window.Call(
"addEventListener",
"load",
func(this Value, args []Value) any {
window.Call("alert", "The window has loaded!")
return nil
},
)
}
So what went wrong? The variable window
has type js.Value
,
and its method Call
has signature (m string, args ...any) js.Value
.
But really this is misleading, because you can't pass just
any
arguments--the type of args
you need to pass depends
on the JS function being called and sending the wrong type or
wrong number of arguments will cause a runtime error!
In this case we're calling the JS function window.addEventListener
,
which accepts a single JS callback. However a go func
type is not a
automatically converted to a JS callback. We need to explicitly
create a js.Func
using the js.FuncOf
method.
func alert2() {
var window Value
window = js.Global().Get("window")
window.Call(
"addEventLister",
"load",
js.FuncOf(func(this Value, args []Value) any {
window.Call("alert", "The window has loaded!")
return nil
}),
)
}
Except... oops, there's no method addEventLister
! If you look
carefully there's a typo: it should be addEventListener
.
It gets very easy to lose track of all these little details,
especially if you think about a codebase with thousands and
thousands of lines of code. The fundamental problem is that
the syscall/js
package needs certain consistency guarantees
that are not represented in its use of types. What we need
is a wrapper for this package that ensures correctness by
using the type system so mistakes can be caught before
compilation rather than during runtime.
Here's an example of a type safe wrapper for window
:
type Window struct {
Value
}
type Event struct {
Value
}
func GetWindow() Window {
// no typos, so this always return the
// JS Value of the window
return Window{js.Global().Get("window")}
}
func (w *Window) AddEventListenerLoad(cb func(e Event)) {
w.Call(
// No typos here!
"addEventListener",
js.FuncOf(func(this Value, args []Value) any {
// Web API spec guarantees first argument
// will be a generic JS Event type
event := Event{args[0]}
cb(event)
return nil
}),
)
}
func (w *Window) Alert(message string) {
// JS alert method expects a string
w.Call("alert", message)
}
func alert3() {
window := GetWindow()
window.AddEventListenerLoad(func(e Event) {
window.Alert("window has loaded!")
})
}
Assuming our wrapper code is implemented correctly, we
now have a reusable, type-safe interface for calling
alert
on the JS window element when the window loads. In
fact, there is an entire project called gopherjs
that
tries to do exactly this: provide type-safe and correct
wrappers for just about every Web API.
Types Assert Logical Consistency
By restricting the ways in which a function can be called, we can reduce and often eliminate incorrect execution paths.
The wrapper methods in the syscall/js
example above
effectively assert, "If you give me values of this type,
I can always correctly execute the wrapped method." This is
in fact a logical statment! The arguments quantify over the
set of values for which correct execution is guaranteed, and
the body of the method transforms those values into the ones
that will work correctly with the wrapped method.
In the Curry-Howard perspective, the function call signature is a theorem that says "correct execution is guaranteed for values of this type" and the method body is a proof that demonstrates how correct execution is performed for those values. In the abstract interpretation perspective, the method type is a set of values and the method body describes an abstract execution trace for elements of that set.
At the Level of Assembly
Types allow modern programming languages to work. When working
with assembly, there is no notion of type. All you have are raw
bytes, and you must be very careful to remember what those bytes mean
when you are writing code. When working in a typed language, though,
the type system can take care of some of this reasoning for you. You
won't ever accidenally try to add the raw bytes of an integer value
to the raw bytes of a floating point value, because the type system
remembers x
is an int
and y
is a float
and so x + y
doesn't
make sense (unless the language you're using is smart enough to
do type conversion automatically!).
Type Definitions as Logical Guarantees About Memory
Most languages provide a means of constructing new types.
type Dot struct {
X int
Y int
}
From a logical perspective, this says that a segment of memory
that has type Dot
will always contain two integers. For
example if the type system knows that address 0xff0000
is
the start of a Dot
value, then 0xff0000
is the start
of the first int
type and 0xff0000+0x10
is the start of the
second int
type. The type system also knows how long the
memory segments for these values are based on their type. I just
made up these numbers and different languages have different
models for storing data in memeory, but the general idea holds
true: a type definition tells the compiler what the memory
layout looks like.
If you're working with assembly and don't have a type system, then you are responsible for remembering how memory is laid out. This is pretty unworkable, so basic type systems and type definitions are necessary for working with large programs.
Investigating Various Modern Programming Languages
Type can be used for all sorts of things. In fact, some type theories are expressive enough to represent arbitrary computation and hence arbitrary logic. Here's a few examples of some interesting choices made by various programming languages in their type systems:
TypeScript's "Type Predicates"
type MyType = {
doesTheThing: () => void;
hasTheThing: int;
}
function isMyType(t: T): t is T {
return
(t as T).doesTheThing !== undefined
&& (t as T).hasTheThing !== undefined
}
function main() {
// value has type any
var value: any = {
doesTheThing() { console.log("hoopla!") }
hasTheThing: 42
}
if (isMyType()) {
// type checker now knows that
// value has type MyType
value.doesTheThing()
}
}
Type predicates allow the type checker to narrow the scope of a type. In TypeScript there is a notion type polymorphism, where a value can have multiple types. This allows the value to be used in functions that accept any of its types. However in some cases you may want to perform some runtime check to determine if a value meets the conditions necessary to be considered as having a certain type. If that check succeeds, type predciates allow you to make a guarantee to the type checker that the value has that type. Without the predicate, the type checker will continue to assume that the value has the original type.
Rust's "Borrow Checker"
Yes, the quintessential selling point of the rust programming language is its borrow checker. The borrow checker statically checks to make sure the a value has not been freed from memory before its use.
fn f() &Int {
let x = 7;
return &x // error! value `x` does not live long enough
}
fn main() {
f()
}
In this example the type system is able to determine that
the value x
is referenced after its underlying value
has been dropped. In rust, variables are dropped as soon
as they go out of the scope, so in this case x
is dropped
right before the return
statement of f
. Therefore any
reference to x
that outlives this scope will be invalid!
Go
Go allows for multiple inheritiance, which is a form of polymorphism.
type MyType struct { }
func (m *MyType) DoTheThing() {
// ...
}
type MyOtherType struct { }
func (m *MyOtherType) DoTheOtherThing() {
// ...
}
func MyStruct{
MyType
MyOtherType
}
func main() {
v := MyStruct{
MyType{},
MyOtherType{},
}
v.DoTheThing()
v.DoTheOtherThing()
}