Please Buy now at Leanpub
Written and illustrated by GetContented.
Published on 2024-05-24.
Feline Crescent has ten houses on it. Each house has a cat of a different breed living in it. We’ll be seeing sections of programs that we’ll slowly build up to a program that finds the oldest of the cats on the street and prints out where they live, along with their age in equivalent human years.
First, let’s see a data type for cat breeds:
data CatBreed =
Siamese | Persian | Bengal | Sphynx
| Burmese | Birman | RussianBlue
| NorwegianForest | CornishRex | MaineCoon
If you remember, we know this sum type means we’re declaring a new type of data called CatBreed
, and all the possible values are listed above.
Next, we’ll see a data type for Cat
. This is a new kind of data type called a product type. This lets us make values that combine more than one type. When we saw tuples, you may remember we said that there are better ways to combine multiple values into one value. Well, this is that better way. A Cat
can have an Age
, and a Name
, and a breed (CatBreed
).
type Name = String
type Age = Integer
data Cat = Cat Name CatBreed Age
This tells Haskell that Cat
is a new type for data, and that it has only one value constructor. Both sum types and product types are examples of algebraic data types. You can have combinations of these types in the one type, which means you can have types that are sums of product types. We’ll see more of this later, so don’t worry too much about this for now. Algebra is another fancy word which just means a sort of language of combining elements together — in this case, we’re combining types. You probably know this word from Maths if you’ve studied it.
Our code also tells Haskell that Cat
is the name of the value constructor, as well as the name of the type. So how does Haskell know when we’re talking about the type, and when we’re talking about the value? Well, by looking at the places we’re using it, it can use inference to work it out.
Notice the Cat
type is defined as one single Cat
data constructor, which has variable “slots” for the name, breed and age. This actually tells Haskell to make the value constructor function Cat :: Name -> CatBreed -> Age -> Cat
and this can be used to make Cat
values (which is why it’s called a data constructor), as well as pattern match for these values. Remember how (:)
is a value constructor for the list type? Well, it’s the same thing at work here.
Next we’ll see a product type for House:
type HouseNumber = Int
data House = House HouseNumber Cat
And a function to work out how old a cat is in human years:
-- this is a commonly agreed upon
-- way to work out cat ages
humanAge :: Cat -> Age
humanAge (Cat _ _ catAge)
| catAge <= 0 = 0
| catAge == 1 = 15
| catAge == 2 = 25
| otherwise = 25 + (catAge - 2) * 4
We’re using a guard pattern to match all the possibilities here. Maybe you can remember that the (<=)
operator means “is less than or equal to”. There is a similar operator for the other direction, too (>=)
which means “is greater than or equal to”.
Notice that the first argument to humanAge is a Cat
, a single value of the Cat
type, but it’s being kind of “pulled apart” by the pattern match using the value constructor. This takes the first two fields of the Cat
data type and ignores them (by matching them to “_
” which as you might know by now, basically throws them away), and then binds a variable name to the age of the Cat
called catAge
, so the rest of the function can compare and do math with it.
Next we’ll see some data for a street
, which will be a list of houses, and a couple of functions for working with that data:
street :: [House]
street =
[ House 1 (Cat "George" Siamese 10)
, House 2 (Cat "Mr Bigglesworth" Persian 5)
, House 3 (Cat "Mr Tinkles" Birman 1)
, House 4 (Cat "Puddy" Burmese 3)
, House 5 (Cat "Tiger" Bengal 7)
, House 6 (Cat "The Ninja" RussianBlue 12)
, House 7 (Cat "Mr Tinklestein"
NorwegianForest
8)
, House 8 (Cat "Plain Cat" MaineCoon 9)
, House 9 (Cat "Shnooby" Sphynx 7)
, House 10 (Cat "Crazy Ears Sam"
CornishRex
3)
]
getCatFromHouse :: House -> Cat
getCatFromHouse (House _ c) = c
getHumanAgeOfCatFromHouse :: House -> Age
getHumanAgeOfCatFromHouse =
humanAge . getCatFromHouse
So, street
is a value whose type is [House]
. The type of House
says it has a single data constructor, also called House
which takes two fields (House :: HouseNumber -> Cat -> House
) and returns a House
value. Each of these houses has an embedded Cat
value in it constructed with the Cat
data constructor which is building a Cat
out of a Name
, its CatBreed
, and its Age
.
Moving on from there, we know that every House
must have a Cat
in it if we look at the House
type, so we see there’s a simple function that extracts the Cat
value from a House
value by using pattern matching (called getCatFromHouse
).
The next function (getHumanAgeOfCatFromHouse
) probably looks a little curious, because we’ve got a new operator in it named (.)
which kind of glues two functions together. It turns them into a single function that does the same as calling the first one on the result of calling the second one. It’s called the function composition operator. We’ll talk more about this function later, but you can think of it as feeding the “output” of the function on the right into the “input” of the function on the left.
The type of getHumanAgeOfCatFromHouse
is annotated to be House -> Age
. As we just saw, getCatFromHouse
takes a House
and gives a Cat
, and humanAge
takes a Cat
and gives an Age
. So, if we somehow could chain or pipe them together (glue them at the Cat
, so to speak!), then giving this new function a House
value would give us back an Age
. This is exactly what the (.)
operator does to two functions: it chains them together to form a new one that does the work of both, as long as they have a common type between them.
Here’s another way to write that same relation by just using regular function application rather than function composition.
getHumanAgeOfCatFromHouse :: House -> Age
getHumanAgeOfCatFromHouse h =
humanAge (getCatFromHouse h)
We can see that we have to include a variable for the House
value here, and we can see that we’re applying the function getCatFromHouse
to the House
, and then applying the function humanAge
to the resultant Cat
.
Sometimes it makes more sense to use normal function application, like the above: humanAge (getCatFromHouse h)
and other times it makes more sense to use function composition like this: humanAge . getCatFromHouse
, but they mean the same thing. We’ll see more of (.)
later.
Now we’ll see a function from the Data.List
module called find
that can be used to get a particular item from a List. A module is a kind of named package of additional functions and types that other programmers have written. So, the Data.List
module contains lots of good things to do with lists, and it needs to be imported in our file at the top. We’re “aliasing” it (or locally renaming it) to L
by using the as
keyword. The qualified
keyword makes sure when it imports all the function names, it doesn’t load them directly into our local namespace. That way, if we already have things named the same thing as the module we’re importing, there won’t be any conflicts.
-- don't forget, this goes
-- at the top of the file
import qualified Data.List as L
Let’s look at the type signature for the find
function, noticing that we’ve qualified the function as being in the L
aliased namespace by putting L.find
instead of just find
:
import qualified Data.List as L
L.find :: Foldable t => (a -> Bool) -> t a -> Maybe a
We can see this function takes two arguments; firstly a function of type a -> Bool
, and secondly a value of type Foldable t => t a
. This function then returns another wrapper type. This one is called Maybe
, and it is wrapping the same type that our Foldable t => t
is wrappering.
What does this function do, though? Well, it takes that function of a -> Bool
and applies it to each of the items in the Foldable t => t a
. If it can’t find any that return True
, it returns the Nothing
value, which comes from the Maybe a
type. If it does find any that return True
, it returns the first one, wrapped in the Just
value constructor from the Maybe a
type.
Here’s an example:
names = ["Harry", "Larry", "Barry"]
result1 = L.find isHarry names
where isHarry name = name == "Harry"
-- result1 will be:
-- Just "Harry"
result2 = L.find isJake names
where isJake name = name == "Jake"
-- result2 will be:
-- Nothing
So we can easily see when it finds a value because True
was returned by the finding function (also called the predicate, remember?), it returns it wrapped it in Just
, and when it doesn’t, it returns Nothing
.
Let’s put this to work on a function that can find the oldest cat in a list of houses:
findOldestCat :: [House] -> Maybe Cat
findOldestCat [] = Nothing
findOldestCat houses = Just oldestCat
where
oldestCat
= getCatFromHouse houseWithOldestCat
houseWithOldestCat
= head housesSortedByCatAge
housesSortedByCatAge
= L.sortBy catAgeComparer houses
catAgeComparer (House _ (Cat _ _ age1))
(House _ (Cat _ _ age2))
= compare age2 age1
This function might look insane at first, but it’s just layers on layers, and we’re going to slowly pull the layers apart together. It’ll be fun, let’s start.
Let’s go from the bottom up. catAgeComparer
is a function that takes two houses and compares the ages of the cats contained within. It does this by pattern matching the ages out of the cats, and the cats out of each house all at once (its type is House -> House -> Ordering
).
An Ordering
is a built-in sum data type which has values of LT
, EQ
and GT
which stand for less than, equal to and greater than respectively. Ordering
values are used by sorting functions in Haskell.
The sortBy :: (a -> a -> Ordering) -> [a] -> [a]
function from Data.List
takes a function whose type is (a -> a -> Ordering)
to sort its second argument: a list. That fits the type of catAgeComparer
, which is why we’re using sortBy
in our definition of housesSortedByCatAge
in the line above that. Put another way, the sortBy
function takes a comparing function (that is, one that returns an Ordering
), and a list, and returns that list sorted by using the comparing function on adjacent elements.
Because we want to sort oldest to youngest, our application of the compare function in catAgeComparer
has age2
first. If age1
was first, it’d sort youngest to oldest.
Next, we have houseWithOldestCat
which takes the housesSortedByCatAge
value obtained by sorting the houses, and picks off the first one with the head
function. It’s only safe to use the head
function when we can be absolutely sure there is at least one item in the list we’re applying it to. Otherwise it will cause your program to crash (crashing is a term that means the program unexpectedly stopped working). We’re sure that there is at least one item in the list we’re applying head
to because we have a clause that matches on the empty list and returns Nothing
.
Finally, oldestCat
is obtained by applying getCatFromHouse
to houseWithOldestCat
, which will obviously just get the cat out of the house.
Now we can talk some more about the Maybe Cat
type that you can see in the findOldestCat
function’s type signature.
Maybe is a type that takes another type to make types with (this is called a type constructor). That means you can have values of type Maybe Int
, Maybe Cat
, Maybe House
, or Maybe
any other concrete type you like.
It’s a sort of wrapper for other types, and it’s what we use in Haskell when we want to make a type’s value optional.
In our case, we can’t be 100% sure if the list we pass to findOldestCat
will contain something. If it is empty, then we obviously can’t pass a Cat
back at all. The way Maybe
values work is there’s a value called Nothing
which represents “no value” of the wrapped type, and there’s a value called “Just a
” which represents any other value of our wrapped type. Take a look at the following values and their type signatures:
Just 5 :: Num a => Maybe a
Just "Heya" :: Maybe String
Just (Cat "YOOBEY" Sphynx 8) :: Maybe Cat
Nothing :: Maybe Cat
Nothing :: Maybe Integer
Nothing :: Maybe a
You might be surprised to know that Nothing :: Maybe Cat
can’t be compared to Nothing :: Maybe Integer
. This is because Cat
is a different type than Integer
and you can’t compare differently typed values (unless they’re polymorphic values — that is, values whose types have type variables like Nothing :: Maybe a
, or 5 :: Num a => a
).
So, Nothing :: Maybe Cat
means “there are no cats”, and Just (Cat "YOOBEY" Sphynx 8)
means “a value of the optional-Cat
type that has a Cat
in it). This is different than the values whose type is Cat
, because a value of the type Cat
must be a Cat
as defined by the type - it can’t be empty at all.
This is a very nice property for a programming language to have. If there is a value whose type is Integer
, you can be sure it won’t have anything other than exactly that in it, which makes reading and reasoning about programs much much easier.
However, every positive side has a negative side, too, and the negative side of this is that it makes working with empty things slightly more complicated than if there were just a general value that means an empty thing. We definitely think the complexity is worth it, though, because these optional types are some of the biggest sources of errors in programming languages that don’t have this feature of “typed optionality”.
We’ve added a main
function and two additional helper functions:
import qualified Data.List as L
data Cat = Cat Name CatBreed Age
type Name = String
data CatBreed =
Siamese | Persian | Bengal | Sphynx
| Burmese | Birman | RussianBlue
| NorwegianForest | CornishRex | MaineCoon
type Age = Integer
data House = House HouseNumber Cat
type HouseNumber = Int
street :: [House]
street =
[ House 1 (Cat "George" Siamese 10)
, House 2 (Cat "Mr Bigglesworth" Persian 5)
, House 3 (Cat "Mr Tinkles" Birman 1)
, House 4 (Cat "Puddy" Burmese 3)
, House 5 (Cat "Tiger" Bengal 7)
, House 6 (Cat "The Ninja" RussianBlue 12)
, House 7 (Cat "Mr Tinklestein"
NorwegianForest
8)
, House 8 (Cat "Plain Cat" MaineCoon 9)
, House 9 (Cat "Shnooby" Sphynx 7)
, House 10 (Cat "Crazy Ears Sam"
CornishRex
3)
]
humanAge :: Cat -> Age
humanAge (Cat _ _ catAge)
| catAge <= 0 = 0
| catAge == 1 = 15
| catAge == 2 = 25
| otherwise = 25 + (catAge - 2) * 4
getCatFromHouse :: House -> Cat
getCatFromHouse (House _ c) = c
getHumanAgeOfCatFromHouse :: House -> Age
getHumanAgeOfCatFromHouse =
humanAge . getCatFromHouse
findOldestCat :: [House] -> Maybe Cat
findOldestCat [] = Nothing
findOldestCat houses = maybeOldestCat
where
maybeOldestCat
= case findOldestCatHouse houses of
Just house ->
Just (getCatFromHouse house)
Nothing ->
Nothing
findOldestCatHouse :: [House] -> Maybe House
findOldestCatHouse houses =
if length housesSortedByCatAge > 0
then Just (head housesSortedByCatAge)
else Nothing
where housesSortedByCatAge
= L.sortBy catAgeComparer houses
catAgeComparer (House _ (Cat _ _ age1))
(House _ (Cat _ _ age2))
= compare age2 age1
getCatName :: Cat -> String
getCatName (Cat name _ _) = name
getHouseNumber :: House -> String
getHouseNumber (House number _) = show number
main :: IO ()
main = putStrLn oldest
where
oldest =
case findOldestCatHouse street of
Nothing ->
"There is no oldest cat!"
Just house ->
"The oldest cat is "
++ getCatName (getCatFromHouse house)
++ ", is "
++ show (getHumanAgeOfCatFromHouse house)
++ " equivalent human years old"
++ " and it lives in Number "
++ getHouseNumber house
Phew! We’ve covered a lot of code in this chapter.
The two helper functions getCatName
and getHouseNumber
use pattern matching to grab out the name of a cat and the number of a house (and make sure they’re a string by using the show
function).
The main
function uses a definition called oldest
which uses a case
expression on the application of findOldestCatHouse
to street
, which returns a value of type Maybe House
. The case
expression is used to match on either the Nothing
value to report an error, or the Just
value, in which case it takes the house out of the Just
value constructor with pattern matching and creates a nice descriptive sentence about the oldest cat, its age in human equivalent years, and where it lives.
Your homework is to write a definition for a name and value whose type is String
, and a program that prints that String
using putStrLn
. Make sure you can do it from memory without looking it up, and write out the types for all definitions in your program.
If you’ve enjoyed reading this, please consider purchasing a copy at Leanpub today
Please follow us, and check out our videos Follow @HappyLearnTutes
Also, Volume 2 is now in beta and being written! Show your support and register your interest at its Leanpub site.