Poor Man's Time Machine: Lazy Evaluation in JavaScript and Haskell
Here is a simpler version of the famous repmin puzzle from Richard Bird's arsenal of functional tricks: Given a
non-empty array of numbers a, you have to find the smallest number m and replace every element of a with m
in a single traversal.
Naive Non-Solution
It is straightforward to build a naive non-solution which calculates the minimum value of a in one pass and then
mutates a in another pass:
function findMin(a) {
let m = a[0]
for (let i = 0; i < a.length; i++) {
if (a[i] < m) {
m = a[i]
}
}
return m
}
function replaceMin(a, m) {
for (let i = 0; i < a.length; i++) {
a[i] = m
}
}
m = findMin(a) // Pass #1
replaceMin(a, m) // Pass #2
The question is: how do we perform these temporally dependent tasks in a single pass? That is, how do we go through
a to replace all its elements with m but somehow also calculate m at the same time?
Basically, we need to traverse a to calculate m and simultaneously replace it with some value x which we can
guarantee is going to be the minimum value?
But wait, won't that require sending a message to the present from the future?
So, is it necessary to invent a time-machine to solve this puzzle? Or could we do with something more modest: say, a simple function and a tiny leap of faith.
Unevaluated Value is Function
To see how a simple function can help us, please note that the seemingly temporal dependency arises only because
replaceMin(a, findMin (a)) forces the traversal in findMin before replacement in replaceMin.
If we could just find a way to represent the minimum value not as a Number in JavaScript but a function instead, the
temporal dependency would vanish because a function doesn't need to be evaluated immediately. We can just replace
every element of a with a function which would evaluate to the minimum value later in the future. This function is
then a proxy for m which delays the need to actually calculate m when we enter replaceMin.
Let's do this with a function findAndReplaceMin which calculates m but also replaces the elements of a by some
as-of-yet unrelated function xf in a single pass:
function findAndReplaceMin(a, xf) {
let m = a[0]
for (let i = 0; i < a.length; i++) {
if (a[i] < m) {
m = a[i]
}
a[i] = xf
}
return m
}
Note that xf is a function which is passed to findAndReplaceMin which replaces every element of the list a by
xf. We still need to choose xf so that every occurrence of xf eventually yields the minimum value, even though
that minimum has not yet been computed.
This seems like a problem. We're still stuck in a situation where evaluation of m requires xf be passed to
findAndReplaceMin but findAndReplaceMin itself needs xf to be passed before m has evaluated.
Make Evidence for Faith
This seems like real problem until we realise that we can bootstrap the evidence for our faith by saying this:
let m = findAndReplaceMin(a, () => m)
This seems paradoxical: how can m be used in its own definition?
The answer is subtle: we are NOT using m to calculate m but only denoting m in a function which calculates m
independently of the denotation.
More precisely, the closure () => m captures the undefined variable m without forcing the evaluation which would
happen only when we say xf(). And if you look carefully, we take care not to do that anywhere inside
findAndReplaceMin (more on this in a bit).
When the line let m = findAndReplaceMin(a, () => m) is executed in code, it would do two separate tasks in one
single pass:
- Calculate the minimum value by traversing
aand bind it tom - Replace every element of
awith the function() => m(without evaluating the function)
Contrast the line above with the obviously invalid declaration: let m = m which immediately attempts to read the value
of m before it has been initialized.
By comparison, let m = findAndReplaceMin(a, () => m) never reads m during the construction of the closure. The
expression () => m merely records how to obtain m in the future. The actual lookup is postponed until the function
is called.
Note that after findAndReplaceMin has run, a is populated by functions and not Numbers. So, when any element of
a is needed, we get it by calling a[i]() which would try to access the value bound to m. But note that that
doesn't need to run findAndReplaceMin again because m is already bound to the minimum value from Step 1.
In functional parlance, writing this self-referential declaration is close to what is called "tying the knot". It
refers to the circular reference to m, which we fill from findAndReplaceMin in the future, but which we can still
use inside findAndReplaceMin in the present if we care not to inspect or evaluate it.
The key to this illusion of time-travel is the assignment a[i] = xf without inspecting xf. We can not afford to poke
xf (say by doing xf()) inside findAndReplaceMin because that would just force the evaluation of m ... inside the
evaluation of m ... I think you get where that would lead.
I should reiterate that no law of physics is being broken here and no information is actually travelling backwards in
time. We instinctively think of computation as a sequence of steps: first compute the minimum, then replace the
elements. But tying the knot invites us to think equationally instead. Rather than saying "let me calculate m first",
we say "here is how i'd like m to be defined given findAndReplaceMin" but with the added benefit that the definition
itself can be mutually recursive if we build it carefully.
We've almost solved this puzzle, but you might now complain: we were asked to create an array of Numbers but ended up
creating an array of functions instead.
And the answer is that with standard primitives, we can't solve this perfectly in JavaScript without leaving behind an array of functions. (We could maybe pull off a solution using some advanced object metaprogramming, but that's probably a topic for another blog post!)
The Real Lazy Evaluation
To check how we can do exactly what we sought out to, we look at a language where delayed evaluation is built-in: Haskell
Let's see how findAndReplaceMin would look like in Haskell:
findAndReplaceMin :: [Int] -> Int -> (Int, [Int])
findAndReplaceMin [y] x = (y, [x])
findAndReplaceMin (h : t) x = (min g h, x : t')
where (g, t') = findAndReplaceMin t x
let (m, b) = findAndReplaceMin a m
It almost mirrors the JavaScript logic, but recursively and immutably.
After findAndReplaceMin has run, the list contains Ints in Haskell. But the same is not possible in JavaScript. Why?
The main reason is that in Haskell, we do not need to rely on closures to delay evaluation because functions are lazy by default.
What that means in practice is that Haskell does not evaluate expressions just because they are passed as arguments or
stored inside data structures. Instead, expressions are represented as unevaluated values (called "thunks") and
evaluated only when their values are demanded in some calculation. This evaluation strategy is known as call-by-need
evaluation. An expression like findAndReplaceMin t x can be easily written without worrying about x being
evaluated before findAndReplaceMin has finished. So, we don't need to resort to tricks like () => m just to delay
evaluation of m. The expression bound to m remains unevaluated until some operation (mostly pattern-matching, or
some function which needs m, like +, or *) actually demands its value.
Compare this to JavaScript, where if write findAndReplace(a, m), both a and m would be evaluated before being
passed to the function, which is why we need to "wrap" m inside () => ... to prevent JavaScript from poking. This
style of evaluation is called call-by-value evaluation and is common in languages like C, Python, and JavaScript.
Also, tying the knot is relatively easier in Haskell. In Haskell, we can just write m on both sides of let (m, b) = findAndReplaceMin a m. A similar attempt in JavaScript is bound to fail, which is also why we took care to hide m
inside () => m. But Haskell can construct cyclic graphs of thunks. The self-reference in let (m, b) = findAndReplaceMin a m does not require the runtime to know the value of m immediately. Instead, it creates a network
of dependencies in which m and b can mutually refer to each other. Laziness ensures that these references are only
followed when their values are actually needed.
Don't Peek Too Soon!
So it turns out we didn't need a time machine after all: just a function, a carefully deferred lookup, and the
discipline not to peek too soon. But more importantly, in both languages, the trick explicitly relies on the same leap
of faith: while the lazy evaluation (Haskell) and wrapper lambda (JavaScript) prevent automatic evaluation by runtime,
we ourselves can not afford to peek into them (by pattern-matching m in Haskell or calling xf in JavaScript) before
findMinAndReplace finishes execution.
If we get too impatient and lose faith, exactly the same fate awaits us in both Haskell and JavaScript: an infinite abyss of doubt till time ends!