Posted on November 27, 2014
by Stuart Popejoy.
Literate
source

*Author’s Note: this article is the first of a series introducing Haskell to experienced programmers, building up language fundamentals, and unlearning assumptions from the imperative world.*

In this article, we’ll take a close look at functions, namely how we call them – function *application* – and how we define them.

Function application seems like a paltry subject. Indeed, in imperative languages it’s pretty simple. C++ can get hairy regarding copy-construction on return values, Java manages to confuse some folks regarding pass-by-value, but in general, it’s `foo(bar,baz)`

and that’s it.

In Haskell, it gets considerably more interesting. This article will try to illustrate some of the surprising things that arise from functions being the fundamental unit of computation. We’ll start in the imperative world, in an attempt to motivate our discussion of why Haskell does things the way it does.

Consider the following two statements in Java or C:

```
int x = 1;
int y() { return 1; }
```

`x`

is a variable assigned to 1, `y()`

is a *nullary function* (a function with no arguments) that returns 1. From a usage standpoint, they’re identical.

```
x + 4 == y() + 4 == 5; // addition
int[] a = {1,2,3};
a[x] == a[y()] == 2; // list-index
x == y() == 1; // equality
```

`y()`

cannot be redefined in runtime, but `x`

can be *re-assigned*. If somewhere before or during the comparisons above, we write

`x = 3;`

then the comparisons will return `false`

(`0`

in C), or worse (the array index will be particularly ugly). Re-assignment means behavior is dictated by the order in which these statements are written, hence the term “imperative”.

Using functions instead of variables offers some interesting advantages:

They can be declared in any (legal) order without affecting execution –

*where*the function is defined has no impact on its evaluation.They are un-assignable and therefore

*immutable.*They

*encapsulate*the value representation. We can change the implementation without breaking client code.

Of course, this glosses over some details. For one, we’d want our nullary functions to be **pure**, that is, avoid side-effects such as IO, system calls, or reading and writing global variables or class fields. Otherwise we can’t really call them immutable.

Another concern is potential performance issues arising from running functions instead of addressing variables. We will simply note that there are positive implications that come from the pure functional approach: for one example, pure code allows for *automatic memoization* by the runtime, since a pure function always behaves the same given the same inputs.

Regardless, even if we want to go all-functions in our imperative code, we’re going to hit some roadblocks. First, we’d *lose local scope entirely,* since C-family languages lack any facility for declaring functions in local scope. Even functional-friendly imperative languages have similar restrictions: for example, Common LISP reserves `defun`

for the top level only. But there’s a more pressing concern.

In the code below, we call `add5`

with `y()`

. What happens?

```
int y() { return 2; }
int add5(int i) { return i + 5; }
void callAFunction () {
add5 ( y() );
}
```

When the call to `add5`

is executed, the function `y()`

is *evaluated eagerly* and assigned to the argument variable `i`

. Thus within the body of `add5`

, our nullary function is long gone. We’re back to variables and assignment as soon as we invoke a function.

If we wanted to preserve our immutable, pure semantics, we’d need the compiler to “weave in” the actual definition of `y()`

into the code of `add5`

. In other words, when we call functions, we want to *substitute* our function invocation wherever the argument is referenced.

This substitution has an interesting consequence, which is that *immediate evaluation of y() is no longer necessary* to invoke

`add5()`

. Substitution is really just building up a big computation, whose evaluation can be deferred until the program needs the results. This is called Lazy evaluation can be tricky to reason about, but it offers interesting advantages, in that un-needed code branches don’t have to be optimized away by the compiler: they are simply never executed.

So we have to throw in the towel on functionalizing our imperative languages. But we’ve learned what we want to see in another language to get us there:

We want to avoid referencing variables, preferring functions for immutability, purity and encapsulation.

We want to define functions easily in any scope.

We want argument substitution instead of eager evaluation.

In Haskell, we can write

`x = 1`

and we end up with something altogether different than the options we had in C or Java. The syntax looks like variable assignment – `int x = 1;`

– but in fact we’ve defined something akin to our nullary function above – `int x() { return 1; }`

.

In truth, it’s neither: this is a definition that binds the symbol `x`

to a constant value `1`

. Formally, it’s a *closed expression*, not a function. Nonetheless, it gives us everything we were looking for with our nullary functions above:

we can use it anywhere we like, independent of where/when we declared it

We can’t reassign or redefine it; it’s a permanent, immutable definition

We can change how we arrive at its value, e.g.

`x = 2 + 3 - 4`

, without impacting client code.

Informally, we can think of this as a nullary function, and for imperative programmers, this can be helpful, to “unlearn” our instincts to see an expression like `x = 1`

as variable assignment. As we’ll see below, “real” functions with arguments are defined using the same syntax, so it’s not terribly inaccurate to just see this as a definition of a nullary function.

When you see `=`

, think “I’m defining a function” instead of “I’m assigning a variable”.

Haskell is strongly-typed. So what type is `x`

above?

A debate has raged pitting typed, compiled languages like C versus dynamic, untyped languages like Javascript, with the claim that types slow down development and get in the way. Haskell does an end-run around this whole debate by allowing us to have our types, but only specify them when we have to. It does this with *type inference,* meaning that in many cases, the compiler can figure it the type by itself.

Let’s use the Haskell interpreter to see what type `x`

is. If you have ghci, the Haskell REPL, up and running, you can issue `let x = 1`

to follow along. Alternately, you can download the source of this article, which is “literate haskell” and therefore can be loaded up in your interpreter with the “:load” command, and just reference `x`

directly.

We ask ghci using the `:t`

command:

```
ghci> :t x
x :: Num a => a
```

Huh? That’s more complicated than `int y()`

! We’ll defer a deep-dive on Haskell’s type system, and summarize that this means `x`

’s type involves `Num`

-eric types. Unlike many languages, Haskell doesn’t assume the literal `1`

always indicates an integer type!

To keep things simple, and because it’s generally good practice, we’ll provide a **type signature** for our top-level definitions:

```
y :: Int
y = 2
```

Here, we’ve defined `y`

to return the constant value `2`

, and given it a type signature that fixes the type to `Int`

. When we ask ghci, we get a much simpler answer:

```
ghci> :t y
y :: Int
```

When the time comes to use these expressions, Haskell offers a much cleaner syntax, with no parentheses required. Below are the Haskell versions of our imperative code above:

```
someTests = (
(x + 3) == 4, -- addition
([1,2,3] !! x) == 2, -- list-index
x /= y -- (in)equality
)
```

Note: this example is executable Haskell code. To do so, I’ve encased the examples in the dummy function `someTests`

. I didn’t bother to provide a type signature for it, as the type is not relevant. The reader can figure out the type as an exercise, or ask ghci.

We talked about how nice it would be to declare functions in local scope. In Haskell we have all the same tools for local definition as we do at top level:

```
localWithWhere = y * 30 + y ^ 2
where y = 3
```

Here, we’ve defined `y`

to be used twice in the outer function body. The `where`

keyword starts a block of code for definitions in local scope.

Note that this article’s code has already defined of `y`

at the top level, above. Haskell allows definitions with the same name in local scope, which will override the top-level definition. This is called *shadowing* the top-level definition.

You can also use “let” and “in”:

```
localWithLet = let y = 3
in y * 30 + y ^ 2
```

This is more familiar to the imperative idiom, in that something that looks like a variable is defined before the consuming code. I recommend imperative programmers stick with `where`

, since it looks the least like imperative assignment. Otherwise, it’s a matter of style: `where`

allows you to put the high-level functionality first, followed by details; `let`

/`in`

is maybe clearer in some situations.

Local definitions can have type signatures, too:

```
localWithTypeDecl :: Int
localWithTypeDecl = y * 30 + y ^ 2
where y :: Int
y = 3
```

Here we start to see some real benefits of type inference. At top level, leaving off type signatures will generally make your code *harder* to understand, but at local level, type inference makes your code *easier* to read. The type signature for `y`

above is simply verbose: it’s clear from the top-level type that we’re dealing with `Int`

values.

We’ve happily dispensed with variable assignment, with Haskell providing us a clean and powerful syntax for defining expressions. It’s now time to look at “real” functions, those things that have arguments, and see how we define and invoke them.

In a pure language, a unary (one-argument) function directly maps input to output. Let’s define one:

```
unaryFunction :: Int -> Int
unaryFunction x = x * 3 + 2
```

The type signature is `Int -> Int`

, meaning the function takes an `Int`

argument and returns an `Int`

. Thinking of it as mapping input to output, we can imagine the argument “indexing” the output. Indeed `Int -> Int`

even looks like mapping syntax from languages like Perl. The arrow dividing the argument from the result type is the *function type constructor*, which we’ll be seeing a lot.

A binary (two-argument) function can also be thought of as a mapping or indexing operation, but in two dimensions. Here’s a binary function:

```
binaryFunction :: Int -> Int -> Int
binaryFunction x y = x * y + x
```

This is when the type signatures stop looking like anything you’ve seen in an imperative language. The type signature is growing to the left, with arrows dividing the arguments *and* delimiting the argument from the result type. Here’s an example with different types to make the positions clearer:

```
lengthIsDivisible :: String -> Int -> Bool
lengthIsDivisible s i = (mod (length s) i) == 0
```

This function takes a `String`

and an `Int`

and returns a `Bool`

, `True`

if the length of the string is divisible by the second argument.

Let’s compare the signatures so far:

```
y :: Int
unaryFunction :: Int -> Int
binaryFunction :: Int -> Int -> Int
```

Note the consistency and simplicity. It’s admirable, beautiful, maybe confusing at first. But it’s no accident that the syntax is unified, as we’ll soon see.

To invoke functions, we simply delimit argument values with whitespace.

```
invokeUnary = unaryFunction 8 -- will compute 8 * 3 + 2
invokeBinary = binaryFunction 2 3 -- will compute 2 * 3 + 2
```

Again Haskell is simple and beautiful: no parentheses, no dots, just functions and values. However, once we start calling functions with other functions, we’ll need some more syntax. Let’s try calling `binaryFunction`

with `unaryFunction 3`

as its second argument:

`binaryFunction 2 unaryFunction 3 <=== bzzt`

This doesn’t work: the compiler can’t distinguish what’s an argument from what’s a function invocation. We’ll need to clear things up with parentheses.

`invokeParens = binaryFunction 2 (unaryFunction 3)`

That’s better. Evaluation looks straightforward: compute `unaryFunction 3`

and pass the results as the second argument for `binaryFunction 2`

, right? Not so fast!

Recall our discussion of *eager evaluation* above, and how we wanted to avoid assigning variables by *substituting the function call in for the argument variable*. What it means here is we want to substitute the entire invocation of `unaryFunction 3`

into `binaryFunction`

, looking something like this:

`binaryFunctionSubstituted = 2 * (unaryFunction 3) + 2`

`x`

becomes 2, and `y`

becomes `(unaryFunction 3)`

. With this substitution, we now have a *new expression*, which Haskell will evaluate lazily when the results become needed.

Parentheses are not the only way to delimit function arguments. Haskell also has `$`

, the *application operator*. Here’s how we can write the code above without parentheses:

`invokeAppOp = binaryFunction 2 $ unaryFunction 3`

This looks nice. Everything after the `$`

is “bunched together” as the second argument to `binaryFunction`

. It has the same meaning as `binaryFunction 2 (unaryFunction 3)`

but with a sweet functional look.

We could leave it at that, and marvel at Haskell’s flexible syntax. But let’s look deeper. A common mistake is to think that `$`

is a keyword, or lexical syntax. In reality it’s nothing more than a library function.

Let’s ask ghci for its type. To do so, GHCI requires us to put parentheses around `$`

:

```
ghci> :t ($)
($) :: (a -> b) -> a -> b
```

This shows it’s a normal function, since you can’t ask for the type of keywords and syntax:

```
ghci> :t (=)
<interactive>:1:2: parse error on input ‘=’
```

`$`

is just a function, but it’s pretty cool that we can drop it in instead of parentheses! Most languages don’t allow you to arbitrarily replace argument delimiters with library functions. So what’s going on? What about the type signature?

`($) :: (a -> b) -> a -> b`

Uh-oh. Don’t worry, we’ll explicate all of this in due time. For now suffice it to say that **($) is a function that takes two arguments: a unary function to be applied, and its first argument**. “`(a -> b)`

” is the first argument, and you can see it looks like a one-argument function; “`-> a`

” is the second argument, and “`-> b`

” is the return type. The lowercase letters indicate the function is *polymorphic*, which we’ll gloss over for now to say that they take any type.

`$`

takes a unary function and it’s argument. Let’s try it on just a unary function:

`invokeUnaryAppOp = unaryFunction $ 2`

Super-boring, function application restated. What’s the point?

To spice it up, we’ll put parentheses around `$`

; in Haskell this “un-infixes” it and turns it into a “normal”, prefix function:

`invokeUnaryAppOpPrefix = ($) unaryFunction 2`

That’s a little more interesting. Now we can see that `$`

is taking a unary function as its first argument, and the value to apply to that function as its second. Still hasn’t shown us anything though, we’ve just used a function to … call a function. Let’s keep digging.

Let’s use that prefix-notation trick to rewrite our example above:

`invokeAppPrefix = ($) (binaryFunction 2) (unaryFunction 3)`

We have to resort to parentheses again to keep GHC happy. What do we have? `(unaryFunction 3)`

is straightforward. But … something funny is going on with `(binaryFunction 2)`

. The parentheses make it clear that instead of taking two arguments, it only has one! What kind of binary function only takes a single argument??

This is our first example of **partial application**, a core concept in functional programming. The idea is you can “partially invoke” a function with just one of its arguments, and in so doing you create *another* function that takes the remaining argument. To illustrate, let’s define integer addition as a function:

```
intAdd :: Int -> Int -> Int
intAdd x y = x + y
```

We can invoke this the usual way, `intAdd 2 3`

will evaluate to `5`

. If we only provide the first argument, though, we’ll create a new, unary function:

`intAddPartial = intAdd 2`

This compiles, and as promised, `intAddPartial`

is a unary function that always adds `2`

to its value. Let’s play with it in GHCI:

```
ghci> :t intAddPartial
intAddPartial :: Int -> Int
ghci> intAddPartial 3
5
```

Partial application shows up in many functional-ish languages, but it’s often presented as “something special”, distinct from everyday function use. For instance, in Clojure, we have to use the `partial`

function to do so, which is similar to other LISPy languages.

In Haskell, partial application is fundamental. When we said there was more to those elegant type signatures than good looks, this is what we were getting at: the unity of the type signatures matches the unity of partial- and full-application. Let’s compare `intAdd`

and `intAddPartial`

’s signatures:

```
intAdd :: Int -> Int -> Int
intAddPartial :: Int -> Int
```

We can visually see that first argument being “bundled” into the partially-applied version. In Haskell, partial application is as simple as leaving off an argument, resulting in a new function that takes one less argument. Let’s ask GHCI:

```
ghci> :t intAdd 2
intAdd 2 :: Int -> Int
```

Partial application operates hand-in-hand with substitution. Recall above how supplying `unaryFunction 2`

to another function, created a new function with `unaryFunction 2`

stitched into it. The idea here is that **“full” application is no different from partial application**:

```
ghci> :t intAdd 2 3
intAdd 2 3 :: Int
```

“intAdd 2 3” *fully applies* `intAdd`

to derive a *lazy computation* that returns 5. Full application is useful in a lazy context, in that you can “load up” a function with all of its arguments and pass it into another function to be evaluated later, for instance as an “event” ready to dispatch.

Let’s return to our investigation of the application operator `$`

above. Our prefix rewrite resulted in `($) (binaryFunction 2) (unaryFunction 3)`

. Hopefully it is now clear what `binaryFunction 2`

by itself means: `binaryFunction`

partially-applied to 2, creating a new function:

```
ghci> :t binaryFunction 2
binaryFunction 2 :: Int -> Int
```

Finally, we’ve unpacked how `$`

does its magic: it simply relies on normal function application. `binaryFunction 2 $ unaryFunction 3`

can be written entirely with parentheses to make this clear:

`invokePartialParens = (binaryFunction 2) (unaryFunction 3)`

And indeed, the definition of `($)`

is nothing but function application itself:

```
($) :: (a -> b) -> a -> b
f $ x = f x
```

There you have it: `f $ x`

just calls `f x`

. Further up the source file, we also see

`infixr 0 $`

which establishes `$`

as a right-infix function with precedence 0. It’s trivially easy in Haskell to declare new “operators” this way, that is, infix binary functions with “swear word” characters like `$@?`

etc. As a result, very little of Haskell is defined as syntax or keywords, but instead is in the library, with source you can look up.

Recall our use of `($)`

on a unary function above:

`($) unaryFunction 3`

There’s something curious here: `unaryFunction`

*looks* like it’s being applied to `3`

, but instead it’s an argument to the `($)`

function. Thus, `unaryFunction`

is acting purely as a **value**, that is, the value of the first argument to `($)`

.

In Haskell, functions are the main attraction, so this isn’t too shocking. There are some surprising implications though. Since functions are values, then we can use a function “as such” in other definitions. We can “alias” a function:

```
ufAlias :: Int -> Int
ufAlias = unaryFunction
```

No problem: we’re assigning `ufAlias`

to the value `unaryFunction`

, which happens to be of type `Int -> Int`

. We can now use it just like `unaryFunction`

:

```
ghci> ufAlias 1
5
```

This seems simple enough, until you notice *we’ve just defined a unary function without ever mentioning the argument!* In most languages, this is impossible: you have to restate your argument definitions over and over. Here’s some equivalent C code:

```
public int add1(int x) { return x + 1; }
public int aliasAdd1(int x) { return add1(x); }
```

Of course, nothing is stopping us from restating the arguments:

```
ufAliasWithArgs :: Int -> Int
ufAliasWithArgs x = unaryFunction x
```

Stare at `ufAlias`

above and `ufAliasWithArgs`

for a second. They do the same thing, but `ufAlias`

accomplishes it with simple assignment. It makes perfect sense if you think of functions as values.

The ability to define functions without referencing the arguments is a *major* source of confusion for new Haskell developers. Very few languages do this: we’re accustomed to understanding a function by looking at the argument list. Taking the arguments away can be very unsettling.

Hopefully the simple example above makes clear that we’re not “hiding” arguments when we say `ufAlias = unaryFunction`

, but simply working with `unaryFunction`

as a value. Armed with this insight, we can tackle the beast known as “point-free” style, or even more intimidatingly, “eta reduction”, where arguments vanish from the left- and right-side of a function definition.

Let’s take another example from our discussion further up: our partial application `(binaryFunction 2)`

, which created a new function with a single argument. Well, that new function is just another value. Let’s do the same trick with it:

```
bfPartial :: Int -> Int
bfPartial = binaryFunction 2
```

Make sense? We could have written `bfPartial y = binaryFunction 2 y`

, but why bother: the value `binaryFunction 2`

is usable all by itself, so our definition of `bfPartial`

will reference that value alone.

Point-free code is often presented as a stylistic exercise to look like a functional wizard. I’m trying to show here is that it is instead a fundamental consequence of working with functions as values. The point is not to “hide” arguments, but to ramp up the expressiveness of the language to fully exploit the power of functions.

Haskell is a *pure* language, meaning that a computation’s output is always determined by its inputs. This implies that no funny business can go on inside of a function body or definition: we can’t do IO, assign fields or global variables, or make system calls.

As a consequence of purity, Haskell does not *eagerly evaluate* function arguments, but instead *substitutes* the code used to make up the argument into the body of the target function. Substitution means that in Haskell, *applying a function to an argument creates a new function*.

Function application happens one argument at a time: each argument when applied creates a new function which is then applied to the next argument. A function applied to less arguments than specified in its definition is *partially applied*, resulting in a new function with however many arguments are left.

Because Haskell is a *lazy* language, there is no difference between a fully- or partially-applied function: a fully-applied function is not necessarily evaluated at the site where the arguments were applied, meaning you can use a fully-applied function as a event-like nullary function to be “called” elsewhere.

Functions are defined with the equals operator, and specified using type signatures. The arrow operator `->`

delimits argument types and the result type. Functions can be defined and specified at top-level, or locally with `where`

or `let`

/`in`

.

While Haskell is strongly-typed, *type-inference* allows us to elide the type specification where convenient. Idiomatically, we provide type signatures for top-level functions, and leave off type signatures for local functions.

When applying functions, we delimit argument values with whitespace, and delimit inner function application with parentheses. We can also use the library function `$`

in place of parentheses where convenient.

Since functions are values in Haskell, we can work with them “as such”, meaning we can assign other names to them. As a result, we do not have to restate the arguments, and can elide them from our function definitions, a process called “eta reduction”, or “point-free style”.