The purpose of the lazy lists I implemented in my previous post was to build lazy queues. Lazy lists give you constant time concatenation, which can be useful in itself, but I needed it to implement persistent functional queues.

In this post I will use the linked list implementation I made earlier, but I will make a slight change to the lazy lists. It occurred to me, after publishing yesterday’s post, that using the CONS constructor to return pairs from lazy-list thunks was not a good idea. The tail of a linked list should always be another linked list, and if I put a thunk there, it isn’t one.

With pmatch you can put type restrictions on constructor arguments, and we can define a linked list where the cdr is always another linked list, like this:

linked_list := NIL | CONS(car, cdr : linked_list)

All the linked list code assumes that this is what a list looks like, and it is the assumption I broke in my previous post.
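As a minimal illustration of what the type restriction buys us, assuming only that the pmatch package is installed:

```r
library(pmatch)

# The tail of a CONS cell must itself be a linked_list.
linked_list := NIL | CONS(car, cdr : linked_list)

lst <- CONS(1, CONS(2, NIL))  # fine: every cdr is a linked_list
cases(lst,
      NIL -> "empty",
      CONS(car, cdr) -> car)  # matches the outer CONS and returns its car
```

With the annotation `cdr : linked_list`, pmatch can check that the tail argument really is a linked list, which is exactly the assumption my thunk-returning CONS cells broke.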

Instead of using CONS to return the heads of lazy lists, I will add another constructor, LLCONS. This isn’t a lazy list constructor, since I want lazy lists to always be either empty or thunks, but it is what the thunks should return.

lazy_list := LLNIL | THUNK(lst)
lazy_list_head := LLCONS(head, tail : lazy_list)

With this type definition, I need to update lazy_cons and the lazy_macro from the previous post, but all the other lazy functions remain the same.

make_lazy_thunk <- function(expr) THUNK(function() expr)
lazy_cons  <- function(car, cdr) make_lazy_thunk(LLCONS(car, cdr))

lazy_macro <- function(empty_pattern, nonempty_pattern, ...) {
  
  empty_pattern    <- substitute(empty_pattern)
  nonempty_pattern <- substitute(nonempty_pattern)
  extra_args       <- rlang::enexprs(...)
  
  cases_pattern <- rlang::expr(
    cases(.list,
          LLNIL -> !!empty_pattern,
          THUNK(.list_thunk) ->
            cases(.list_thunk(),
                  LLCONS(car, cdr) -> !!nonempty_pattern))
  )
  function_expr <- rlang::expr(
    rlang::new_function(
      alist(.list =, !!!extra_args), 
      cases_pattern, 
      env = rlang::caller_env())  
  )
  
  rlang::eval_bare(function_expr)
}
lazy_reverse <- lazy_macro(acc, lazy_reverse(cdr, lazy_cons(car, acc)), acc = LLNIL)
x <- list_to_lazy_list(1:5)
lazy_to_llist(x)
## 1 :: 2 :: 3 :: 4 :: 5 :: []
y <- lazy_reverse(x)
lazy_to_llist(y)
## 5 :: 4 :: 3 :: 2 :: 1 :: []

Lazy queues

Where we left queues, we had a data structure with amortised constant time operations, but only when treated as an ephemeral structure.

By ephemeral data structures, we mean structures where some or all operations modify them. With such structures we cannot simply keep copies around in the state they had before an operation.

This is clearly not the case for our functional queues. We do not ever modify them—they are persistent—but the runtime analysis will not work if we keep copies around in a state where we have saved up for one expensive operation, and then execute that expensive operation more than once.

I do not mean an if-else statement here. If we only take one of the two branches—as we would with an if-else statement—then the analysis works. We only have a problem if we keep a copy around, compute using the queue for a while, and then return to the copy and do more computations with it.

Say, for example, we have inserted n elements into the queue—so these are all in the back-list—and then branch between two computational paths. On both paths we dequeue an element, which triggers the reversal of the back-list at a cost of n operations. We have money in the bank for the first of these reversals, but on the second branch the bank is empty, and the reversal will put us in the red.

In a lazy queue, we will maintain this invariant: the front-list must always be at least as long as the back-list.

We will reverse the back-list every time it would get longer than the front-list. This can happen both in enqueue and dequeue operations.
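The invariant suggests a representation where we track the list lengths explicitly, so the check is a constant time comparison. The following is only a sketch, with names of my own invention; it assumes the lazy_concat function from the previous post, and that lazy_concat does not force its second argument before the first list is exhausted:

```r
# A queue is a front lazy list, a back lazy list, and their lengths.
lazy_queue <- function(front, back, front_length, back_length)
  list(front = front, back = back,
       front_length = front_length, back_length = back_length)

empty_lazy_queue <- function() lazy_queue(LLNIL, LLNIL, 0, 0)

# Restore the invariant by (lazily) appending the reversed back-list
# to the front-list. The concatenation itself is a constant time
# operation; the reversal is not executed until we have dequeued our
# way through the old front-list.
rotate <- function(queue) {
  if (queue$back_length <= queue$front_length) queue
  else lazy_queue(
    lazy_concat(queue$front, lazy_reverse(queue$back)),
    LLNIL,
    queue$front_length + queue$back_length,
    0
  )
}

enqueue <- function(queue, elem)
  rotate(lazy_queue(queue$front,
                    lazy_cons(elem, queue$back),
                    queue$front_length,
                    queue$back_length + 1))
```

A dequeue operation would similarly end with a call to rotate, since removing an element from the front-list can also leave the back-list too long.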

This is different from the eager queue, where we only reversed when we dequeued, but that is fine because a lazy concatenation only costs one operation. The lazy reversal is an O(n) operation—we cannot reverse linked lists faster than that—but we do not execute it until we need its result, and that doesn’t happen before we need to dequeue an element from the reversed list.

If we have 2n - 1 elements in the queue, n in the front-list and n - 1 in the back-list, and we enqueue one more, we concatenate the reversed back-list to the front-list. If we then branch and dequeue along two computational paths, we won’t have to reverse the back-list until we have dequeued all the elements in the front-list.

If we pay one coin into the bank every time we do a constant time operation, we will have n in the bank (from enqueuing) when we do the concatenate-reverse operation. When we branch and dequeue until we need the reversal, we will have 2n in the bank before we reverse: the n we had before the concatenate-reversal, and another n from dequeuing from the front-list.

This reversal leaves n coins in the bank. Along the second branch, we also need n dequeue operations before we reach the reversal, so we again have 2n in the bank when we reverse. This time the reversal is free—we have already reversed that list, and with lazy evaluation we remember the result.
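Remembering the result is not something we have to implement ourselves; it falls out of R’s promises, which make_lazy_thunk relies on. A promise is evaluated at most once, and the value is cached afterwards. A minimal sketch, using a counter of my own invention to show that the delayed expression only runs once:

```r
evaluations <- 0
make_thunk <- function(expr) function() expr  # expr is captured as a promise

# The block below stands in for an expensive list reversal.
th <- make_thunk({ evaluations <<- evaluations + 1; "reversed" })

th()  # forces the promise; the block runs and evaluations becomes 1
th()  # returns the cached value; the block does not run again
```

After both calls, evaluations is still 1: the second call gets the cached "reversed" without redoing the work, just as the second branch gets the reversed list for free.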

We are not limited to two paths here. I have drawn two, but we can compute along as many branches as we wish. We will just have more and more coins in the bank as long as we dequeue and do not pay more than once for the reversal.

There might be an “off by one” error or two in the accounting here, but it isn’t all wrong and I promise you that you can get it exactly right if you are more careful than I have been.

We cannot reuse reversals that are not part of the same concatenate-reversal operation. So, if we leave the enqueuing that triggers this operation until after we branch along the two computational paths, we do not gain anything from the lazy evaluation.

We still need to dequeue n times before we have to perform the expensive reversal, so along each alternative branch we compute, we save up enough to pay for its independent reversal.