The Evolution of a Type

Apr 20, 2021 - 11 a.m.

Elm is a programming language for building web applications that I have grown to love over the past several years. It is fast, safe, and in my experience, fun to write. One of the standout features of Elm is its expressive type system. When teaching someone Elm, I worry that the depth of the types is lost on them, through no fault of their own, because in addition to learning types, they are usually also learning a new syntax, the specifics of the Elm Architecture, and how to deal with the language’s lack of mutation. To make things more difficult, Elm’s type system has some things in common with the object-oriented type systems programmers are traditionally taught, but it is quite different.

I was lying in bed last night and thought that stepping through the definition and usage of a type as it evolves from simple to complex might make working with types a little clearer to the new Elm programmer.

So that’s what I’m going to do. We’re going to walk through making a user type, which will hopefully give you some insight into the power that Elm types provide and also just how they work. This guide assumes you are familiar with basic Elm syntax. (If you are not but are still interested in Elm, I always recommend the Official Tutorial as a good starting point.)

Lots of web applications have some concept of a user. Let’s assume in our case that our user has an email address and an id represented by a Guid. Hopefully that sounds familiar? Cool. When we first start working with our user, we might just need the id, which we could store in our model.

    type alias Model =
        { userId : String
        }

    model =
        { userId = "726f27bf-6b3d-4421-8c26-9b77c6a57360" }

The Model type here is a record (kind of like a dictionary) and we’re storing our userId as a String. Technically it’s a guid, but we can store a guid in a string since it’s just numbers and letters. How would we use this? Let’s write a function.

    displayUserId : String -> String
    displayUserId userid =
        "User Id is: " ++ userid

That’s ok, but could we make it a bit more explicit what kind of value this function takes? And what kind of value we’re storing in the model? Sure we can.

    type alias UserId = String

    type alias Model =
        { userId : UserId
        }

    displayUserId : UserId -> String
    displayUserId userid =
        "User Id is: " ++ userid

Now we’re using an alias for userId, which is like giving a nickname to the built-in String type. What does this do for us? Well, it makes the Model and the type signature for displayUserId much more explicit. Those aren’t just any strings! They’re user ids. Unfortunately, the utility of this technique mostly stops here. Although we say it’s a UserId in type declarations, when using an alias for String, we are still free to use any String we want. If another string makes its way to the call site of displayUserId, it will happily use it, and we’re not taking advantage of the safety Elm provides us. This is pretty close to how you’d do it in plain old JavaScript. Let’s make it better.

    type UserId = UserId String

Ok, we lost the "alias" part of our type definition. We’re now making real types! This is a type that exists only in our application, which makes sense because the users of most applications are different (ignoring, obviously, that this example user is very generic).

Our type is called UserId and so far it has a single type constructor, which is also called UserId. (The official Elm guide calls these variants.) The constructor does not have to share the type’s name, but in this case it makes sense. The type constructor is used to make values that are of type UserId, and it takes a single String as an argument. This is where we depart drastically from OO types and where I see people getting lost, so maybe re-read this paragraph. Let’s try out our new type in the model.

    type alias Model =
        { userId : UserId
        }

    model =
        { userId = UserId "726f27bf-6b3d-4421-8c26-9b77c6a57360" }

Our userId is now of the type UserId (no longer an alias), and we created a value of that type by passing a guid string to the constructor. This also changes how we use the value in functions.

    displayUserId : UserId -> String
    displayUserId userId =
        case userId of
            UserId idString ->
                "User Id is: " ++ idString

Because we are using our own type, we have to unwrap it in order to use the string that it contains. What we’re doing in this case expression is called pattern matching: we match against the name of the type constructor and put the String value into the variable idString so we can use it. This might seem verbose, but that’s mostly because we have a single type constructor at the moment. We’ll add more in a minute.

The other thing you might be thinking is "it’s still a String, why all the rigamarole?". Well, when our UserId was still a plain String, a string made anywhere in the app could be used in its place. Not super safe. By requiring the type constructor to make UserId values, we force other developers who use our function to explicitly create UserIds. This protects us from many bugs.
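To make that difference concrete, here is a small sketch that puts the two approaches side by side. The names AliasId, WrappedId, showAlias, and showWrapped are made up for this comparison so both versions can live in one module; they are not part of the example we are building.

```elm
module WrapperSketch exposing (..)


-- Hypothetical names, chosen only so both versions fit in one module.
type alias AliasId =
    String


type WrappedId
    = WrappedId String


showAlias : AliasId -> String
showAlias id =
    "User Id is: " ++ id


showWrapped : WrappedId -> String
showWrapped wrapped =
    case wrapped of
        WrappedId idString ->
            "User Id is: " ++ idString


-- A plain String that was never meant to be an id.
emailText : String
emailText =
    "user@example.com"


-- Compiles, even though an email address is not an id:
-- to the compiler, AliasId and String are the same type.
aliasAcceptsAnything : String
aliasAcceptsAnything =
    showAlias emailText


-- Uncommenting the next line should fail to compile with a type mismatch,
-- because a String is not a WrappedId:
-- rejected = showWrapped emailText


-- The caller has to wrap the string on purpose.
explicit : String
explicit =
    showWrapped (WrappedId "726f27bf-6b3d-4421-8c26-9b77c6a57360")
```

The wrapped version does not stop anyone from wrapping a bad string, but it does mean a string can no longer wander into the function by accident.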

Now that the UserId type is giving us some sense of protection, maybe we can also use our type to more fully represent the user data. You might recall our users aren’t just an id, we also keep their email address. Also: sometimes people who use websites aren’t logged in. We could use an empty string for id when users aren’t logged in, but that doesn’t make explicit what we’re actually doing. So let’s make some changes to our type.

    type alias UserId = String
    type alias UserEmail = String

    type User = LoggedInUser UserId UserEmail
              | AnonymousUser

The type has changed from UserId to just User, and now it has two type constructors. Both of them produce values that are of the type User. One is called LoggedInUser and it takes two values (both String aliases) that represent the fields we care about for a logged in user. The other is called AnonymousUser. It takes no arguments because we don’t have any data on a non-logged in user.

    --- we know our user
    loggedIn = LoggedInUser "726f27bf-6b3d-4421-8c26-9b77c6a57360" "user@example.com"

    --- just someone who walked in off the street
    anonymous = AnonymousUser

Explicitly making these parts of the type makes it clear what is happening in our code, and it forces us to handle all the potential states our code can be in when we use the type. Let’s see how this bears out in our displayUserId function.

    displayUserId : User -> String
    displayUserId user =
        case user of
            LoggedInUser idString _ ->
                "User Id is: " ++ idString
            AnonymousUser ->
                "User is not logged in"

Now the case expression makes more sense. Every constructor for the type needs to be handled when we use a case expression, so in order to use User, we now have to handle every variant. This makes our code more resilient and representative of the problems we are solving. Freely using types to represent the actual state of the application is something we can get a lot of mileage out of.
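If you want to see the compiler enforce this, here is a minimal sketch with a second function. The name isLoggedIn is my own invention for illustration, and the constructor arguments are plain Strings to keep the module short.

```elm
module UserSketch exposing (User(..), isLoggedIn)


type User
    = LoggedInUser String String
    | AnonymousUser


-- isLoggedIn is a hypothetical helper, not part of the running example.
isLoggedIn : User -> Bool
isLoggedIn user =
    case user of
        LoggedInUser _ _ ->
            True

        -- Delete this branch and the module should no longer compile:
        -- the compiler reports that the case does not cover every variant.
        AnonymousUser ->
            False
```

The same thing happens if you later add a third constructor to User: every case expression that forgot about it gets flagged until you handle it.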

Every step along the way here improves upon the last (and they are all better than dynamic types) and if you want to stop here that’s a great start. However, you might be saying to yourself: "just because the id of the user is wrapped in a type doesn’t prevent a sloppy developer from sticking a random string into it!" And you are correct.

So how do we make our User type even safer? The answer is through encapsulation. We stick our User type into a module and we make it accessible only in safe ways. An OO parallel would be the public/private API of a class.

    module User exposing (User, createUser, anonymousUser, displayUserId, getUserEmail)

    type alias UserId = String
    type alias UserEmail = String

    type User = LoggedInUser UserId UserEmail
              | AnonymousUser

    createUser : UserId -> UserEmail -> Maybe User
    createUser id email =
        if isValidId id && isValidEmail email then
            Just (LoggedInUser id email)
        else
            Nothing

    anonymousUser : User
    anonymousUser = AnonymousUser

    getUserEmail : User -> String
    getUserEmail user =
        case user of
            LoggedInUser _ email ->
                email
            AnonymousUser ->
                "anon@nowhere.com"

    --- isValidId, isValidEmail, displayUserId left to the imagination

The most important thing to note first here is the exposing line at the top of the module. We expose the type User, but explicitly do not expose the constructors for the type (that would look like User(..)). This means that code outside of the module can use our User but cannot create instances of it with the constructors. It also means that outside code cannot pattern match against those constructors. The only way to create a logged in User is through the createUser function, which you can see has added checks to assert the strings handed to it are valid values for id and email (anonymous users come from the anonymousUser value). I’m using a Maybe to wrap the return value and returning Nothing when the input is invalid, but you could get feisty and use a Result to better effect.
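For the curious, here is a rough sketch of what the Result flavor might look like. The UserError type is something I am making up for this sketch, and the two validators are deliberately naive placeholders (a length check and a search for "@"), since the real ones are still left to the imagination.

```elm
module User exposing (User, UserError(..), createUser)


type User
    = LoggedInUser String String
    | AnonymousUser


-- Hypothetical error type: one variant per reason creation can fail.
type UserError
    = InvalidId
    | InvalidEmail


createUser : String -> String -> Result UserError User
createUser id email =
    if not (isValidId id) then
        Err InvalidId

    else if not (isValidEmail email) then
        Err InvalidEmail

    else
        Ok (LoggedInUser id email)


-- Placeholder check: a guid written with hyphens is 36 characters long.
isValidId : String -> Bool
isValidId id =
    String.length id == 36


-- Placeholder check: real email validation is much more involved.
isValidEmail : String -> Bool
isValidEmail email =
    String.contains "@" email
```

The payoff over Maybe is that the caller learns which input was rejected and can show a more useful message. UserError exposes its constructors on purpose so callers can pattern match on the failure, while User stays opaque.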

If you can’t pattern match against the constructors, how do you use the type though? The module needs to expose ways of getting the values out of the User. In this case those are displayUserId and getUserEmail. If the integrity of your data is important, this is a reasonable hoop to jump through. This also gives you freedom to change the underlying shape of your type without having to rewrite all the code that touches users. Decide that User should be a record instead? The id should change into a real Guid type from String? You only need to change code inside the module.

Is it true that someone on your team could still make unsafe Users by exposing the constructors? Sure, but at some point you need to trust your coworkers, and hopefully that sort of thing would get nabbed in a code review. If you’re writing a library you don’t need to worry about that.

This is just the tip of the iceberg when it comes to types, but if you can understand everything I’ve talked about here you’re well on the way to using them to great effect.

Thanks for reading!
This article owes a lot to Evan Czaplicki’s talk The Life of a File, which is 100% worth a watch if you’d like to learn more.