Rust is a modern programming language which is marketed primarily on the basis of its very nice type system, and I'd like to tell you about how you can use this type system to reason about your programs in interesting ways. Most of the time when its type system is discussed, the focus is on its guarantee of data race freedom and ability to enable so-called fearless concurrency (and rightfully so---this is a place where Rust truly shines!). Today, I have a different focus in mind, characterized perhaps most succinctly as follows:
From the type of a polymorphic function we can derive a theorem that it satisfies. Every function of the same type satisfies the same theorem. This provides a free source of useful theorems.
Philip Wadler, Theorems for Free!
If you're not the most mathematically inclined, don't be scared off by the word theorem! The quote is telling us that---with the right property of our type system---we can learn useful properties about generic (i.e. polymorphic) functions solely by inspecting their types. In the rest of this post, we'll cover this type system property, and a number of example properties we can derive from types as as result.[1] Much of what's covered can be generalized to languages aside from Rust, but (most) examples will be in Rust with Rust-specific aspects highlighted.
A Principal Property for Reasoning
The property at the heart of this style of type-based reasoning with generics is known as parametricity. Parametricity can be formulated as a mathematical theorem,[2] but it's best thought of intuitively as the notion that all instances of a polymorphic function act the same way. With this intuition in mind, you can imagine determining whether or not a particular function is parametric. For example, we can determine that the following Java function is parametric:
public static <T> T identity(T x) {
return x;
}
And that the following Java function is not:
public static <T> T notIdentity(T x) {
if (x instanceof Integer) {
return (T) (Integer) 42;
} else {
return x;
}
}
The reason for this is that the latter function has chosen to specialize its behavior based on the type of its parameter, rather than acting the same on all types. This cuts to the essence of parametricity: to write parametric functions, we must treat parametric types opaquely! While Java does not enforce parametricity (and in fact often encourages otherwise), other type systems like that of Haskell and Rust require all functions to be parametric.[3] When all polymorphic functions are parametric, the type system is said to be parametrically polymorphic---though in practice, many parametrically polymorphic type systems support some degree of ad hoc (that is, type-dependent polymorphism). In this case, we know that all polymorphic functions are parametric and we're able to learn some of their properties solely from their type. So, let's look at some examples in Rust.
Who am I? or: Reasoning about Identity
Consider the following function type, and try to imagine as many implementations as possible:
fn<T>(T) -> T
This type describes a function that for any type T
, takes an argument of type T
and returns a result of type T
. If you're already familiar with Rust, I'm sure it wouldn't
take long to come up with the following implementation, the identity function:
pub fn id<T>(x: T) -> T {
x
}
In fact, since there are no operations we can actually perform on x
, it's the only possible
return value for this function. Of course, since Rust is effectful, we could print something before
we return like so:
pub fn effectful_id<T>(x: T) -> T {
println!("oh no");
x
}
And Rust is also partial, meaning we could error (called panicking in Rust) or otherwise diverge:
pub fn panicking_id<T>(_: T) -> T {
panic!("at the disco")
}
pub fn diverging_id<T>(_: T) -> T {
loop {}
}
These various implementations all tell us something about what the type means, which we can phrase like so:
A function of type
fn<T>(T) -> T
must:
- return its argument or
- panic or abort or
- never return
Additionally, since we still know nothing about the type T
, we can conclude that any
effects that occur during the function are not dependent on the argument. With these two
properties, we can then conclude the more general properties that functions of the type
fn<T>(T) -> T
behave "like an identity function":
Given a function
id
of typefn<T>(T) -> T
, a total functionf
of the formfn(A) -> B
whereA
andB
are both concrete types, and a valuea
of typeA
, then either:
id
can be composed arbitrarily (e.g.id(f(a)) = f(id(a))
) orid(f(a))
andf(id(a))
both panic or diverge.
In order to conclude this, we can consider each of the cases we previously described. If the
function returns its argument, then we know both that id(a) = a
and
id(f(a)) = f(a)
and we can combine these two equalities to conclude the first result. If
the function does not return its arguments, we know it either panics or never returns but we also
know that this cannot be dependent on the argument in any way. Thus if id(f(a))
panics,
then f(id(a))
must panic as well.
With that, we've intuited (but have not formally proven)[4] our first "useful theorem" about a family of functions based solely on their type. While it's nice to know that identity-looking functions behave like an identity function, there's certainly nothing earth-shattering about the result. But the fact that we can apply this style of reasoning to every type ought to be compelling.
Vectors Abound
Let's look at a slightly more complicated type now, involving Rust's Vec<T>
type for
dynamically-sized buffers. We'll again follow the same formula of enumerating some possible
implementations before trying to conclude a general property. Given the type:
fn<T>(Vec<T>) -> Vec<T>
We can come up with implementations such as:
pub fn tail<T>(vec: Vec<T>) -> Vec<T> {
vec.into_iter().skip(1).collect()
}
pub fn reverse<T>(vec: Vec<T>) -> Vec<T> {
let init = Vec::with_capacity(vec.capacity());
vec.into_iter().fold(init, |mut acc, elem| {
acc.insert(0, elem);
acc
})
}
pub fn swap_first_two<T>(mut vec: Vec<T>) -> Vec<T> {
if vec.len() < 2 {
return vec;
}
let elem = vec.remove(1);
vec.insert(0, elem);
vec
}
We can then try to capture a sense of what this type means as we did before:
A function
m
(for mystery) of typefn<T>(Vec<T>) -> Vec<T>
must:
- return a
Vec<T>
that contains a subset of the contents of its argumentVec<T>
in any order. (i.e.∀v. {e | e ∈ m(v)} ⊆ {e | e ∈ v}
) or- panic or abort or
- never return
The process of concluding this is more complicated, but the general gist is that such a function can
only perform the operations defined on Vec<T>
and as usual cannot inspect the types of its
elements. From there, we know that we cannot create new values of type T
or perform any
operations dependent on values within the vector. This also leverages the Rust-specific fact that
values (in this case, of type T
) cannot be copied without knowing that they implement
Clone
and/or Copy
(whereas in other languages with parametricity, this typically
is not the case). We can then conclude that all functions at this type must yield a permutation (or
possibly a subset of a permutation) of the input vector. Of course, the same exceptions about panics
and divergence apply. Interestingly, we can reach a similar general conclusion to the one we reached
for fn<T>(T) -> T
:
Given a function
m
of typefn<T>(Vec<T>) -> Vec<T>
, a total functionf
of the formfn(A) -> B
whereA
andB
are both concrete types, anda
is a value of typeVec<A>
, then either:
mystery(map_f(a)) = map_f(mystery(a))
wheremap_f
is defined as|x| { x.iter().map(f).collect() }
or- at least one of
mystery(map_f(a))
andmap_f(mystery(a))
panic or diverge.
Noninterference for Free
Thus far, we've looked at rather simple properties of programs because it is easier to imagine the proof in your head. But now, let's take the opportunity to explore a security property called noninterference for which a number of tailored type systems have been built. The idea behind these type systems is typically that you annotate types and values in your program with labels indicating whether a value should be public or secret (some systems expand this with further labels, but just the two are enough for the basics). Noninterference then says that functions with public output cannot depend on private inputs. Fortunately, using parametricity, we can have this property for free in Rust![5]
To do so, first, we have to define a notion of secret (we'll treat all unannotated types as public, though we could choose to introduce a public type as well for symmetry):
pub struct Secret<T>(T);
Strictly speaking, we've now achieved noninterference! That was probably easier than you expected,
but the intuition should be clear: since we can perform no operations whatsoever on values of the
type Secret<T>
, it is impossible for public outputs to depend on secret data! However,
there is a caveat: because of how access modifiers work in Rust, code in the same module can violate
noninterference like so:
pub struct Secret<T>(T);
pub fn unwrap_secret<T>(secret: Secret<T>) -> T {
secret.0
}
To avoid this, we can place our implementation of secret types inside of its own module with no additional code:
pub mod secret {
pub struct Secret<T>(T);
}
use self::secret::Secret;
Now, we have noninterference enforced in any downstream code, but in real security type systems, you
can still use secret values to compute other secret values. To do this, we can use Rust's trait
system to add common functionality. We can use this to define a lot of operations, but some of the
operator-overloading traits (std::ops
) are not currently general enough making some code
less pleasant.[6] Here is our example with some ability to use secret values to compute other
secret values:
pub mod secret {
#[derive(Copy, Clone, Default)]
pub struct Secret<T>(T);
use std::ops::{Add, Sub};
impl<T> Add for Secret<T> where T: Add {
type Output = Secret<<T as Add>::Output>;
fn add(self, other: Secret<T>) -> Self::Output {
Secret(self.0 + other.0)
}
}
impl<T> Sub for Secret<T> where T: Sub {
type Output = Secret<<T as Sub>::Output>;
fn sub(self, other: Secret<T>) -> Self::Output {
Secret(self.0 - other.0)
}
}
// ...
}
use self::secret::Secret;
Now, we have some ways of using our secret data to construct other secret data. It's limited, but
many other extensions should follow similar patterns and we could also add other operations
implemented directly on Secret<T>
types that compose secret values without going through a
trait like so:
impl Secret<bool> {
pub fn branch<F, T>(&self, cons: F, alt: F) -> Secret<T>
where F: Fn() -> Secret<T> {
if self.0 {
cons()
} else {
alt()
}
}
}
With all these extensions, the argument that parametricity is still enforcing noninterference is now
dependent on the exact set of operations that have been implemented for Secret<T>
, but as
long as they always return an argument of the form Secret<T>
, Rust will enforce
noninterference. We can even include operations that combine Secret<T>
and T
as
long as their results are themselves secret. We could even imagine building a simple static analysis
tool that runs atop Rust to audit a crate providing such a secret type to ensure that every function
it implements returns a secret marked type.
Bountiful Properties with Bounded Parametricity
Though we used traits to extend the functionality of our Secret<T>
type, they played a
somewhat limited role in our argument for noninterference via parametricity, but we can do more.
Fundamentally, traits allow us to bound type parameters with a specific interface that can be used
within functions. This allow us to weaken our notion of parametricity from type parameters and
values at those types being completely opaque to values at those types being usable in a controlled
fashion. Correspondingly, we can derive even more interesting properties from the types. For a
simple example, consider this extended version of our original identity example:
fn<T>(T) -> T where T: Display
Previously, we said that any side-effects of this function could not depend on the argument. By
adding the Display
bound on T
, we've allowed the argument to be displayed in
output effects like println!
. In a sense, this new ability to display the argument is
expanding the allowed set of side-effects. This expansion is most evident from the fact that all of
our old implementations are still legal at this bounded type, but new implementations are also
legal. For example:
pub fn trace<T>(x: T) -> T where T: Display {
println!("{}", x);
x
}
You may have noticed as we went through our earlier noninterference example that this property seems
almost useless by virtue of being overly strict. In particular, since public outputs cannot depend
on secret values in any way, there's really no reason to use secret values at all. In practice,
security type systems offer escape hatches (much like Rust's unsafe
) to selectively reveal
secret information in a way that is readily auditable. With traits, we can build a principled escape
hatch giving us a weakened property known as relaxed noninterference.[7] Relaxed noninterference
can be understood intuitively as the property that public outputs can only depend on secret values
according to predetermined rules known as declassification policies.
In our formulation in Rust, we will record these policies as traits and use trait bounds to decide
what policies are available within a function. Consequently, the type signatures of our functions
will necessarily have to tell us how they plan on using the secret data we give them giving us
strong, local reasoning principles for security. At the heart of this approach is our previous
definition of Secret<T>
with a trait representing the empty declassification policy:
pub struct Secret<T>(T);
pub trait Sec<T>: private::Sealed {}
impl<T> Sec<T> for Secret<T> {}
mod private {
use super::Secret;
pub trait Sealed {}
impl<T> Sealed for Secret<T> {}
}
Our private module here is used to seal the Sec<T>
trait preventing it from being
implemented on any additional types beyond Secret<T>
. With just this, we can now specify
functions like before that have noninterference:
pub fn f<S, T>(x: u32, y: S) -> u32 where S: Sec<u32> {
// the following line is not legal...
// return y.0;
x
}
We can then specify a number of declassification policies that enable us to make selective use of our secret values:
// Debug declassification policy: can format the value for debugging purposes
impl<T> Debug for Secret<T> where T: Debug {
fn fmt(&self, f: &mut Formatter) -> Result<(), Error> {
self.0.fmt(f)
}
}
// Zeroable declassification policy: can determine whether or not this is zero
impl<T> Zeroable for Secret<T> where T: Zeroable {
fn is_zero(&self) -> bool {
self.0.is_zero()
}
}
// Hash declassification policy: can compute a hash of the value
impl<T> Hash for Secret<T> where T: Hash {
fn hash<H>(&self, state: &mut H) where H: Hasher {
self.0.hash(state);
}
}
And then we can use these declassification policies to discern legal implementations of specific types as we've done before. Consider the type:
fn<'a, S>(S, u64) -> bool where S: Sec<&'a str> + Hash
We know that there are some trivial implementations (e.g. comparing the u64
against
0
) that don't make use of the secret value, but what about implementations that do? We can
come up with something like:
pub fn check<'a, S>(password: S, db_hash: u64) -> bool
where S: Sec<&'a str> + Hash {
// please don't actually do this, use bcrypt or scrypt instead.
use std::collections::hash_map::DefaultHasher;
let mut hasher = DefaultHasher::new();
password.hash(&mut hasher);
hasher.finish() == db_hash
}
Now, if we connected this to a web framework (like the amazing Rocket), we could imagine
having our forms always providing passwords as secret values. Then, by using traits as
declassification policies, we can use the type system to ensure that we never accidentally misuse
the password. However, we should be wary: we used Hash
in this example because it's
provided by std
and includes already-implemented hash algorithms, but it's actually
overly-permissive for this purpose. We could write a custom hasher that would allow us to leak
information or even completely reveal the value. For a real implementation, we would instead provide
a more constrained trait that allows you to compute a specific cryptographic hash such as bcrypt or
scrypt.
Some Final Words
If you've made it this far, you've seen a bunch of "crazy academic concepts" like parametricity, free theorems, and noninterference. You've also seen how traits can be used to relax parametricity and give us even more useful free theorems. Hopefully, this endeavor has convinced you of the strength of type-based reasoning in Rust. The small examples that you've seen throughout the post are really just scratching the surface of this kind of reasoning: we can go further by using the added constraints from the ownership system to produce even more interesting theorems (such as that a cryptographic nonce is only used once). The extent of these reasoning capabilities is one of my personal favorite features of strong type systems, and subsequently one of my favorite things about Rust. Maybe it'll be one of yours now too!
Owing to their presentation in the paper Theorems for Free!, these properties are known in the academic world as free theorems---though I suspect that some will be unhappy with my liberal application of this term to intuited properties. ↩︎
And indeed, it was originally presented as the abstraction theorem in John C. Reynolds' Types, abstraction, and parametric polymorphism. ↩︎
Strictly speaking, Haskell's
seq
breaks general parametricity, as do Rust's various reflection capabilities (includingSized
) and the upcoming impl specialization feature. Fortunately, like in the Haskell paper, we can always refine our notion of parametricity. Though this does have some consequences for precisely what properties you can glean from types. ↩︎Though our argument is somewhat proofy, we would require a formal semantics for Rust. There exists one in the form of RustBelt, and as part of my research, I hope to produce an alternative formal backing for these free theorems in Rust, particularly the latter ones related to security. ↩︎
A connection between parametricity and noninterference was commonly held wisdom in the programming languages community, but was not proven until Bowman and Ahmed's Noninterference for Free. ↩︎
The consequence of this is that we would need to define methods instead of operators which would make secret code look weirder and be less ergonomic, but is not a fundamental limitation to this approach. If the trait definitions were made more general, this would be a nonissue, and we could use macros instead to offer some improvements. ↩︎
In Type Abstraction for Relaxed Noninterference, we see a related presentation of relaxed noninterference as a consequence of object-oriented type abstraction capabilities. Since Rust uses parametric polymorphism with traits for type abstraction, we are developing an analogue here. ↩︎