Please Buy now at Leanpub
Written and illustrated by GetContented.
Published on 2016-08-02.
Let’s say we have a movie collection, and for some reason, we're only interested in movies whose first letter is in the first half of the alphabet. In our program, we'll call those good movies, and the others bad.
movies = [ "Aeon Flux" , "The Black Cat" , "Superman" , "Stick It" , "The Matrix Revolutions" , "The Raven" , "Inception" , "Looper" , "Hoodwinked" , "Tell-Tale" ]
We'd like to be able to make a new list of our movies with “good” or “bad” appended to the name so we know which we can watch.
What we need is a function that decides if a movie is good:
isGood :: String -> Bool isGood (x:_) = x <= 'M' isGood _ = False
This kind of function, one that returns a Bool value that is depending on another value, is called a predicate function. It’s yet another fancy word which comes to us from the subject of Logic. It simply means something can be either affirmed or denied about something else. That’s exactly what’s going on here. We’re finding out if something is good or bad, by our arbitrary distinction, and returning a
Bool value depending on that.
The first line takes any
String of at least one element, matches x to the first element, then throws the rest away, and uses the
(<=) :: Ord a => a -> a -> Bool operator to check if it’s earlier in the alphabet than
There are some new things here, obviously. First is that we’ve got single quotation marks around M. What is the type of this thing? Well it’s
'M' :: Char. In Haskell, single quotation marks are used to indicate that something is a
As we mentioned before, a
String is simply
[Char]. That means
"Hello" is the exact same thing as
['H', 'e', 'l', 'l', 'o']. So,
String has a special syntax, just like list has a special syntax. You can either write strings as a list of
Char values, or you can write them with double quotation marks around them.
Back to our predicate function, we notice we’re using the
(<=) function. This is called “less than or equal to”. Its type indicates it takes two values of type constrained by the
Ord typeclass. This typeclass is for types whose values can be compared in some ordered way (hence the name).
We’re matching on the
String argument using
(:), grabbing out the head as
x, comparing it to the
‘M’ to see if it’s in the first half of the alphabet, because
A is “less than or equal to”
M. The head of a list is the Haskell name for its first item, and the tail is the name for the remainder.
Ord typeclass provides a type with the
min functions. They provide functionality to do various comparisons of values.
The end result of the above is a function that checks if a movie’s title starts with a letter in the first half of the alphabet.
Now we'll see a function that takes a movie title and gives us a new name for it, depending on its good/bad assessment:
assess :: String -> String assess movie = movie ++ " - " ++ assessment where assessment = if isGood movie then "Good" else "Bad"
In this function, we’re introducing you to a
where clause. This isn’t an expression, so we can’t embed
where clauses anywhere we like, but rather they let us write definitions that are only “in effect” within the level above where they appear. This is called local scoping. That is, the scope of the definition only applies locally to the
This is why in the
assess function above, we can use the definition for
assessment within the function’s body expression.
We can have multiple definitions and even functions written in
where clauses. Let’s see another definition of the assess function that has two definitions in its
assess' :: String -> String assess' movie = movie ++ " - " ++ assessment where assessment = if movieIsGood then "Good" else "Bad" movieIsGood = isGood movie
Here we’ve just pulled
isGood movie out into its own locally scoped definition. We don’t recommend it, but if you wanted to get crazy, you could even put
where clauses within definitions within other
where clause definitions. Usually if you have this level of nesting, you should consider pulling your functions into smaller pieces.
Next we're going to see a partial function so we can explain how it works to you in stages. Here, by partial we mean it's not total, because it’s missing some cases of values of its arguments. That is, the function doesn't cover every possibility. This is not a good thing to do, ordinarily. We're just doing it here to show how we build up a function in steps.
So, this function we want to write should take a list of movies and give us back the newly named movie list...
assessMovies :: [String] -> [String] assessMovies  =  assessMovies [y] = [assess y]
... but it doesn't, because it only works properly on lists of length 0 or 1, and we already know that in general, lists can be any length. The
[y] pattern only matches against lists of one item, naming the item
y as it does, and here we're returning a new list with the newly named movie in it by passing it to our
assess function. We see we can have function applications inside of lists without trouble.
Next, we’ll see that we add a pattern for lists of two items and change the whole thing to use the
(:) operator as a pattern-matcher, because we know
(x:) are the same thing, but also to use the
(:) operator to make the new list rather than how we did above:
assessMovies :: [String] -> [String] assessMovies  =  assessMovies (y:) = assess y :  assessMovies (x:y:) = assess x : assess y : 
This is a little better, but what about lists with more than two items? Also, it’s getting tedious writing so many definitions, and a pattern of repetition is emerging. As we become programmers, we need to learn to see these repeated patterns, because they're usually an indication that something can be done in a better, or more abstract way.
One of the first patterns to notice is our second definition works for 1-item lists, and our third pattern-match has the exact same 1-item list pattern in it as the second definition.
If we could match the
(y:) part as one whole variable name in the third pattern, then somehow replace it with an application of the assessMovies function itself, we might be able to reduce these two definitions down to just one.
Well, you probably guessed that we can do this, and in fact this is exactly what we're going to do. This process of using the function we're defining within that function definition itself is called recursion, and it's one of the foundational underpinnings of Haskell.
So these two duplicated patterns can be resolved with a single definition, as long as we keep our empty list as a base case, otherwise we'd end up in an endless cycle. The empty list is our way out of the recursion. The recursive definition handles everything other than the empty list, which the base case handles.
assessMovies :: [String] -> [String] assessMovies  =  assessMovies (x:xs) = assess x : assessMovies xs
So, if we had the list
["The Matrix"], that's equivalent to
"The Matrix" : , and so matches the second pattern with
"The Matrix" as
xs. The body of the function uses the list construction operator
(:) to join the result of
assess x to the application of
assessMovies with its argument of the empty list, which results in the empty list as the base case, ending the recursion. You should think about how this works across a few different lists of different sizes.
Another way to think of this is if you had to crack some eggs. You could write down a definition for making cracked eggs like this: “crack eggs” is: take a line of eggs. Crack the first egg, then do “crack eggs” on the rest of the eggs until there are none left. See how “crack eggs” is both the definition and used in itself? It's the same idea here, except Haskell creates a new list rather than modifies an existing one. We define building a list of assessed movies as: if the list is empty, give an empty list, otherwise take the first item and make its assessment String, joining that as the head of the list to the evaluation of the rest of the assessed movie titles list.
It's really beautiful, isn't it? You can start to see why having purity can really pay off. We can do this kind of substitution and equational reasoning easily. Why? precisely because we have pure functions, and because Haskell gives us these nice abstract pieces to work with.
This pattern - applying a function (the function
assess here) to each element of a list — is so common that there’s actually a function called
map that extracts this mapping functionality. It takes a mapping function and a list as arguments. Its type is
map :: (a -> b) -> [a] -> [b]. This type says that map is a function of two arguments, the first one of which is a function.
assess function’s type is
String -> String. This fits into the pattern of the first argument of map:
a -> b. In Haskell
a -> b means the types can be different, but don’t have to. It's a function of one type to a second type (which may or may not be the same type). So,
assess :: String -> String fits the first argument to
map. If we supply the function
map with the function
assess as its first argument, that means we’re saying the type
map’s type signature must be
String, and also that the type
b must be
String, too. That means the second argument and the return type must both be
map to get an intuition of how to use it, and build a small full program around it. If any of this is unclear, the next chapter will most likely clear it up as it delves more into functions such as
import qualified Data.List as L movies = [ "Aeon Flux" , "The Black Cat" , "Superman" , "Stick It" , "The Matrix Revolutions" , "The Raven" , "Inception" , "Looper" , "Hoodwinked" , "Tell-Tale" ] isGood :: String -> Bool isGood (x:_) = x <= 'M' isGood _ = False assess :: String -> String assess movie = movie ++ " - " ++ assessment where assessment = if isGood movie then "Good" else "Bad" assessMovies :: [String] -> [String] assessMovies = map assess assessedMovies :: [String] assessedMovies = assessMovies movies main :: IO () main = putStrLn (L.intercalate "\n" assessedMovies)
First we import the
Data.List package so we can use the intercalate function (which takes a joining list of a type, and a list of Lists the same type and builds a fresh list by joining each element of the list together with the joining item between). Here we’re using it with
[[Char]] which is the same as
String is just a type alias for
If we were to define a “new” version of
String, we could do it like this, because
String is simply a type alias to a list of
Char, otherwise known as a type synonym:
-- type sets up a type alias: type NewString = [Char]
This is how we create a type alias (or type synonym). Once we have this in our program, everywhere Haskell sees
NewString in your program, it will interpret it as
[Char]. Note that these are not different types, they're identical, they just have different interchangeable names.
Anyway, after this, back in our main program, we set up the movies expression, and the
assess functions. Then we create the
assessMovies function which takes the list of movies and uses
assess to build a list of assessed movies. Once that is done, we create the
assessedMovies expression that simply applies
assessMovies to the movies list.
main function then simply prints out with the passed in newline (
"\n" is the special newline character in a
String) between each of the
assessedMovies (that’s what
See if you can change the program so that it has a different first letter that decides which movies are bad and good.
After you’ve done that, see if you can change the program so that it has different movie titles in it.
If you’ve enjoyed reading this, please consider purchasing a copy at Leanpub today
Please follow us, and check out our videos Follow @HappyLearnTutes
Also, Volume 2 is now in beta and being written! Show your support and register your interest at its Leanpub site.