
Jeremy Fairbank

Software Engineer and Consultant. Hawaii. Author of Programming Elm and Conference Speaker.


This post picks up from a series on functional programming I started a few months ago. In the last post, I explained recursion and immutability through constructing the list data structure. In this post, we will examine another concept in functional programming: streams.

Streams

So, what are streams? You have probably heard of streams in JavaScript and Node before, and streams in functional programming are similar. One important use case for streams in functional programming is representing infinite data sequences such as the natural numbers.

But what is so special about streams? Well, just like streams in Node, we can work with data in small chunks, avoiding exhausting memory. Imagine trying to represent all the natural numbers in memory! When I say that we work in “small chunks,” I mean that we delay getting values in a stream until necessary. This is called lazy evaluation.
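To make "delay getting values until necessary" concrete, here is a minimal sketch of lazy evaluation using a thunk, a computation wrapped in a function so it only runs when we force it by calling the function (the `lazySquare` helper is hypothetical, just for illustration):

```javascript
// Lazy evaluation with a thunk: wrap a computation in a function so it
// only runs when we explicitly force it.
let evaluated = false;

const lazySquare = (n) => () => {
  evaluated = true;
  return n * n;
};

const thunk = lazySquare(4);
console.log(evaluated); // false -- nothing has been computed yet

console.log(thunk());   // 16 -- the computation is forced here
console.log(evaluated); // true
```

A stream applies this idea repeatedly: each value is available immediately, but the rest of the sequence stays behind a thunk.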

With a stream of natural numbers, that means I would start the stream with the value 1. I would have to consume the stream again to get the next value of 2. Repeating this process would yield 3, 4, and so on.

Creating Streams with Generators

If you have kept up with ES2015, this concept might sound familiar. Generator functions afford us the power of streams in JavaScript. So how would we create a stream of the natural numbers with generators?

natural-numbers-generator-imperative.js

```javascript
function *naturalNumbers() {
  let n = 1;

  while (true) {
    yield n++;
  }
}

const nats = naturalNumbers();

console.log(nats.next().value); // 1
console.log(nats.next().value); // 2
console.log(nats.next().value); // 3
```

This is a pretty straightforward implementation. We keep track of the current number in the sequence with n. We set n = 1 because naturally (pun intended) the natural numbers start with 1. Then, inside an infinite loop, we yield n and increment it to the next number in the sequence.

However, this implementation is not very functional. We use mutation by reassigning values to n. We could alter the implementation to use recursion:

natural-numbers-generator-recursive.js

```javascript
function *naturalNumbers() {
  function *_naturalNumbers(n) {
    yield n;
    yield *_naturalNumbers(n + 1);
  }

  yield *_naturalNumbers(1);
}

const nats = naturalNumbers();

console.log(nats.next().value); // 1
console.log(nats.next().value); // 2
console.log(nats.next().value); // 3
```

Now we create a generator function that uses an internal generator function called _naturalNumbers. It takes n as a parameter with an initial value of 1. Inside our internal function, we yield n and then recursively yield a call to the same function with n + 1. Notice the use of yield *. This special syntax allows us to delegate a yielded value to another generator function. If we had just used yield, we would have yielded a new instance of the generator instead of the next number in the sequence.
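To see that difference directly, here is a small sketch (the `inner`, `delegated`, and `notDelegated` generators are hypothetical names, used only for this comparison):

```javascript
function* inner() {
  yield 1;
  yield 2;
}

// Delegation: yield* forwards every value the inner generator produces.
function* delegated() {
  yield* inner();
}

// Without delegation: yield hands back the generator object itself,
// as a single value, instead of the values it would produce.
function* notDelegated() {
  yield inner();
}

console.log([...delegated()]);           // [1, 2]
console.log([...notDelegated()].length); // 1 (a single generator object)
```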

There is still a problem with this implementation, though. Notice how we consume our values with nats.next().value. That generator has to maintain state internally for it to be able to yield each value in the sequence. That means some mutation is taking place. If we want to be super strict about immutability with functional programming, then we need to roll our own stream data structure.

Creating the Stream Data Structure

How do we accomplish creating a stream data structure? Remember that streams are lazy structures. We evaluate the first value but delay evaluating the remaining values. That sounds like we need some type of callback to represent the remaining values.

Hey! This is starting to sound like a list too. So, we need to create some type of data structure with a head and a tail, where the tail is evaluated at a later point in time. An implementation could look something like this then:

natural-numbers-functional-stream-1.js

```javascript
function naturalNumbers() {
  function _stream(n) {
    return {
      value: n,
      next() {
        return _stream(n + 1);
      }
    };
  }

  return () => _stream(1);
}

const nats = naturalNumbers();

const one = nats();
const two = one.next();
const three = two.next();

console.log(one.value);   // 1
console.log(two.value);   // 2
console.log(three.value); // 3
```

Now this is definitely more functional. We remove mutation and depend on recursive calls of an internal function to generate the values in our stream. Let’s walk through this implementation.

I mentioned we use recursive calls to an internal function. We create the internal function _stream that takes in our n parameter. It returns a list-like data structure. The head is the value property, which contains the value of n. The tail is a function called next that returns a recursive call to _stream with n + 1 as an argument. This is similar to the recursive generator implementation we had earlier. Finally, to kick it all off, we return a closure that calls _stream with our initial value of 1.
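A nice consequence of this design, sketched below: because each node is produced by a pure call to _stream, consuming a node never invalidates it. Calling next() on the same node twice returns the same value both times, unlike a generator.

```javascript
// The stream implementation from above.
function naturalNumbers() {
  function _stream(n) {
    return {
      value: n,
      next() {
        return _stream(n + 1);
      }
    };
  }

  return () => _stream(1);
}

const nats = naturalNumbers();
const one = nats();

// A stream node is persistent: calling next() twice from the same node
// produces the same value each time -- no internal state is mutated.
console.log(one.next().value); // 2
console.log(one.next().value); // 2 again
```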

Now, one thing that is slightly irritating is the way we have to consume values: we have to call the initial stream and then each successive next function by hand. What we need is a helper function that can grab multiple values from the stream and return them.

The take Function

To help us solve the problem of consuming multiple values from a stream, we can create the take function. take is a function that takes a number n and a stream str and returns the first n values of str. An implementation of take would look like:

take.js

```javascript
function take(n, str) {
  function _take(n, str, accum) {
    if (n === 0) {
      return accum;
    }

    const { value, next } = str();

    return _take(n - 1, next, accum.concat(value));
  }

  return _take(n, str, []);
}
```

Again, we use an internal function to help us. (We could use an optional parameter with the outer function instead, but that might not be considered strictly functional.) The internal function _take takes the same argument types as the outer function take, n and str, as well as another argument accum. accum is short for accumulator, and it will be an array that holds the values we have consumed from the stream.

We use recursion with _take to obtain our stream values. We grab the current value and the next function by destructuring the result of calling the current stream. Then we make the recursive call, passing in n - 1 and the next function as our new stream. We also concatenate the current value onto the accum array, producing a new array that we pass in as the new accumulator.

Remember that we decrement n with n - 1 in our recursive call. This is important because it will lead to our base case when n === 0. At that point, we no longer want to consume values from the stream, so we return whatever is in accum.

This implementation is not without its faults because it will have a quadratic runtime thanks to our calls to accum.concat. We could fix that with mutation via array.push or using a list like the one in my last functional programming post, but we won’t worry about that here.
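For the curious, that mutation-based fix might look like the following sketch (the name takeIterative is hypothetical; the post's take stays as-is):

```javascript
// Hypothetical linear-time variant of take: it trades strict
// immutability for an O(n) runtime by mutating a local accumulator
// with push instead of rebuilding it with concat.
function takeIterative(n, str) {
  const accum = [];
  let current = str;

  for (let i = 0; i < n; i++) {
    const { value, next } = current();
    accum.push(value);
    current = next;
  }

  return accum;
}

// A minimal stream of natural numbers to exercise the function.
function naturalNumbers() {
  function _stream(n) {
    return { value: n, next: () => _stream(n + 1) };
  }

  return () => _stream(1);
}

console.log(takeIterative(5, naturalNumbers())); // [1, 2, 3, 4, 5]
```

The mutation is confined to local variables, so callers still see a pure function.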

So, let’s use our new take function to get the first ten natural numbers:

natural-numbers-take.js

```javascript
const nats = naturalNumbers();
const firstTen = take(10, nats);

console.log(firstTen);
// [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Great! This is not only simpler, but a more declarative way of consuming values from a stream.

Creating Other Streams

Now we could apply our new stream technique to create other infinite sequences. What if we wanted to create a stream of the Fibonacci sequence? Remember the Fibonacci sequence is a sequence of numbers where every number is the sum of the previous two numbers in the sequence. We also need two base case numbers, 0 and 1. Therefore, the sequence would be 0, 1, 1, 2, 3, 5, 8, etc. Implementing this as a stream could look something like this:

fibonacci-sequence-stream-1.js

```javascript
function fibonacciSequence() {
  function _stream(current, next) {
    return {
      value: current,
      next() {
        return _stream(next, current + next);
      }
    };
  }

  return () => _stream(0, 1);
}

const fibs = fibonacciSequence();
const firstTen = take(10, fibs);

console.log(firstTen);
// [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Notice anything? Reapplying the stream technique to the Fibonacci sequence seems duplicative. What we really need is a more DRY approach to creating streams. Therefore, we need to abstract the construction of the stream data structure into a reusable function. Then, we can build streams with just a function call:

stream.js

```javascript
function stream(fn, initial) {
  function _stream(value) {
    return {
      value,
      next() {
        return _stream(fn(value));
      }
    };
  }

  return () => _stream(initial);
}
```

This implementation closely models what we already did with the natural numbers and the Fibonacci sequence. We create an internal _stream function that takes some value and then returns an object literal that wraps that value and a closure that produces the next value. Notice in the next function when we make our recursive call that we also call the fn argument that was supplied to stream. This is the function that is responsible for actually generating the next value. In the case of the natural numbers, it would be a function that adds 1 to its argument.

Now we can convert our two streams to use our new stream function:

stream-usage.js

```javascript
const nats = stream(n => n + 1, 1);
const fibs = stream(([current, next]) => [next, current + next], [0, 1]);

console.log(take(10, nats));
// [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

console.log(take(10, fibs));
// [0, 1, 1, 1, 1, 2, 2, 3, 3, 5, 5, 8, 8, 13, 13, 21, 21, 34, 34, 55]
```

With our new stream function, all we have to do is provide a recurrence function and an initial value. Whatever is returned from the recurrence function is used as the next value in the stream and is also the next argument to the recurrence function (recurrence inception, I know). For the natural numbers, that recurrence is simply adding 1 to the last number: n + 1.

But wait a minute! What is going on with the Fibonacci sequence? Our recurrence function takes in an array and returns an array. Also, the output from console.log(take(10, fibs)) seems to be duplicating numbers. Recall that our stream uses the return value from the recurrence function as the next argument for the recurrence function. Also, recall that the Fibonacci sequence is defined as the sequence of numbers where each number is the sum of the previous two numbers. Therefore, we must use an array to keep track of the previous two numbers. That presents a problem when we actually consume numbers in the Fibonacci sequence. We only want the first element in each array returned from the recurrence function. What we need is some way to map over streams.

Mapping Over Streams

If streams are like lazy lists, then surely we can map over them like lists. However, we need to ensure that a map function for streams does not eagerly consume the stream. That would defeat the purpose of the stream, and with an infinite stream it would never terminate.

Thus, map must produce a new stream itself. Then, when we consume the mapped stream, we will get back the modified values of the original stream. We can implement map like so:

map.js

```javascript
function map(fn, originalStr) {
  function _stream(str) {
    const { value, next } = str();

    return {
      value: fn(value),
      next() {
        return _stream(next);
      }
    };
  }

  return () => _stream(originalStr);
}
```

Essentially, what we do in map is reimplement our stream pattern from the stream function. Our initial value will be the stream over which we want to map. We also use an internal helper function called _stream that actually produces the stream data structure.

When we return our stream data structure, we call the mapping function fn on the recently consumed value from the original stream to produce the new value property. With this implementation, we can lazily consume and transform the values from the original stream. Now, we can fix our Fibonacci sequence with the aid of a helper function called first:

fibonacci-mapped.js

```javascript
function first(array) {
  return array[0];
}

const fibs = map(
  first,
  stream(([current, next]) => [next, current + next], [0, 1])
);

console.log(take(10, fibs));
// [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Awesome! That fixes our issue and feels very functional, at least in the sense of composing functions to produce our stream. We could even use map to create the sequence of even numbers from our sequence of natural numbers:

even-numbers-sequence.js

```javascript
const nats = stream(n => n + 1, 1);
const evenNumbers = map(n => n * 2, nats);

console.log(take(10, evenNumbers));
// [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
```

Filtering Streams

So, we can map over our streams, but what if we wanted to filter values? For example, with filtering we could create the sequence of odd numbers from our natural numbers, or we could filter out numbers that are less than 42. Unsurprisingly, we can write a filter function that is similar to our map function:

filter.js

```javascript
function filter(fn, originalStr) {
  function _stream(str) {
    const { value, next } = str();

    if (fn(value)) {
      return {
        value,
        next() {
          return _stream(next);
        }
      };
    }

    return _stream(next);
  }

  return () => _stream(originalStr);
}
```

filter-usage.js

```javascript
const nats = stream(n => n + 1, 1);
const oddNumbers = filter(n => n % 2 !== 0, nats);
const gte42 = filter(n => n >= 42, nats);

console.log(take(10, oddNumbers));
// [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]

console.log(take(10, gte42));
// [42, 43, 44, 45, 46, 47, 48, 49, 50, 51]
```

filter, like map, is very similar to the stream function. It uses the original stream as the initial value, and its internal function _stream filters the values it consumes from the original stream. _stream accomplishes this by consuming a value and running the predicate fn on it. If fn returns true, _stream returns a stream node like the one the original stream would have returned, except that its next function continues filtering over the original stream's next function. If fn returns false, _stream skips the value by recursively calling itself with the original stream's next function.
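Because map and filter each return a new stream, they compose directly. As one sketch of this, chaining them over the naturals produces the odd squares (the definitions are reproduced from above so the example is self-contained):

```javascript
// Definitions from earlier in the post, repeated for a runnable example.
function stream(fn, initial) {
  function _stream(value) {
    return { value, next: () => _stream(fn(value)) };
  }

  return () => _stream(initial);
}

function map(fn, originalStr) {
  function _stream(str) {
    const { value, next } = str();
    return { value: fn(value), next: () => _stream(next) };
  }

  return () => _stream(originalStr);
}

function filter(fn, originalStr) {
  function _stream(str) {
    const { value, next } = str();

    if (fn(value)) {
      return { value, next: () => _stream(next) };
    }

    return _stream(next);
  }

  return () => _stream(originalStr);
}

function take(n, str) {
  function _take(n, str, accum) {
    if (n === 0) {
      return accum;
    }

    const { value, next } = str();
    return _take(n - 1, next, accum.concat(value));
  }

  return _take(n, str, []);
}

// Compose: keep the odd naturals, then square each one -- all lazily.
const nats = stream(n => n + 1, 1);
const oddSquares = map(n => n * n, filter(n => n % 2 !== 0, nats));

console.log(take(5, oddSquares)); // [1, 9, 25, 49, 81]
```

No values are computed until take forces them, so the composition itself is free.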

Conclusion

Streams are a powerful concept in JavaScript and functional programming. We can represent infinite sequences without consuming a ton of memory. We can also use the concepts of composition to create new sequences from other sequences through lazy helper functions like map and filter. This notion of composition is critical to functional programming and helps reinforce other concepts like modularity and the single responsibility principle. Streams are an important tool in the bag of functional and JavaScript programmers.