## Tail-call optimization for mutual recursion without trampoline, in Javascript

#### by Guillaume Lathoud

Programming with tail call optimization combines high performance and algorithmic clarity. The present page provides a clean, solid implementation that optimizes mutual tail recursion in Javascript.

Update 2011-12-13: The interactive area now reports syntax errors and tailopt errors.

Update 2013-03-19: Much lighter implementation: js.metaret

Update 2018-06-28: Even lighter implementation: fext.js

Great thanks to Aubrey Jaffer and Brendan Eich for their seminal works, Schlep Toolchains and the Narcissus parser, and to Nikita Vasilev for the feedback.

See also the ⇒ production examples and an integration example with object methods.

### Motivation

Encapsulating code into many small functions helps to clarify what each part does. To illustrate this, here is a Greatest Common Divisor (GCD) example:

``````function gcd(a,b) {
if (a > b) return gcd(a-b, b);
if (a < b) return gcd(b-a, a);
return a;
};``````

The function calls `gcd(a-b, b)` and `gcd(b-a, a)` make clear what happens in the various cases. For a real-life example, have a look at the sorted array search code.

But when functions call each other too often, things slow down a lot. Can we do something about it?

### Intermezzo: Tail calls

A tail call is a `return` followed by a single function call, as in:

``return gcd(...);``

Anything else is not a tail call:

• A `return` with a composed expression is not a tail call:
``return f( a ) + f( b );``
• A function call without a `return` is not a tail call:
``var x = f( ... );``
• You can find more counter-examples in two unit tests.

If you did not know anything about tail calls, I strongly encourage you to read SICP Section 1.2.

### Proposed approach

The GCD code:

``````function gcd(a,b) {
if (a > b) return gcd(a-b, b);
if (a < b) return gcd(b-a, a);
return a;
};``````

has two tail calls:

``return gcd(a-b, b);``

and:

``return gcd(b-a, a);``

In essence, the GCD function is repeating itself, just changing the parameter values `a` and `b`. This is an iterative process. So we can automatically transform the code into a loop, where the function calls have been eliminated:

``````function gcd(a,b)
{
var a, b, a_new, b_new;

_L_gcd_: while(true)
{
if(a > b)
{
a_new = a-b;
b_new = b;
a = a_new;
b = b_new;
continue _L_gcd_;
}
if(a < b)
{
a_new = b-a;
b_new = a;
a = a_new;
b = b_new;
continue _L_gcd_;
}
return a;
}
};``````

The resulting code runs much faster. To summarize, we have a readable, neatly encapsulated original code (useful for development and maintenance) and a fast generated code (useful in production).

### Implementation

Explanation

To avoid function decompilation issues, and to prevent variable name collisions, I opted for an implementation that transforms one string of Javascript code into another string. This is essentially meant to be used at build time, for example just before compressing the javascript code (short example).

To avoid surprises, the programmer must explicit which parts of the code need to be optimized, wrapping the functions in a `tailopt( ... )` call:

``````var gcd = tailopt( function(a,b) {
if (a > b) return gcd(a-b, b);
if (a < b) return gcd(b-a, a);
return a;
});``````

`tailopt( ... )` acts like the identity function - it can be seen as a marker function:

• During development, `tailopt()` returns the original function, unmodified,
• During build, the transformation:
``optimized_code_string = tailopt.str( original_code_string ).new_code();``
replaces in `original_code_string` all occurences of `tailopt( ... )` with optimized code. Only those parts will be replaced.

This way, the exact same code used for developement (and debugging) is fed into the build system.

Code

You can download the whole tarball. The top-level implementation (./tailopt.js) consists in:

• `tailopt()`: The marker function,
• `tailopt.str()`: The code transformation function.

The corresponding unit tests are in ./test.unit.tailopt.js and ./test.unit.tailopt_str.js. They may be interesting as cookbook. A short example of integration into a build system is ./prep_tailopt.js. It runs fine on Google's V8 Engine, and has been used to produce the present page.

### Performance results

Speed is expressed in number of iterations per second (the higher, the better). I did the tests on a laptop with Ubuntu linux. You can all performance tests right here on your computer (results marked with «??»).

• ch: Google Chrome 7.0.517.41 beta
• ff: Firefox 3.6.13 (Firebug off)
ExampleMutual recursionTail-recursiveTail-optimized
Speed
ratio
Click on a button to reset
the interactive area (further below)
(iter/sec)(iter/sec)(unitless)
No
No
No
Yes
(2 functions)
Yes
(3 functions)
Explanation
Yes
(2 functions)

All cases show a speed improvement, even for algorithms that already run in O(log n) (string replication and sorted search). The dramatic improvement in the GCD/Chrome case suggests a synergy between the built-in engine optimizations and the proposed tail-call elimination.

Overall, the optimization seems particularly indicated when:

• There is branching, and it is not easy to predict which if-else branch will be taken from one «visit» to the next (e.g. GCD).
• The implementation is decomposed in small actions, each encapsulated in a function (= looping a lot).

Having an automatic code optimization permits to take advantage of these two points, and clearly separate levels of abstraction and individual actions.

### Interactive area

Code input:

Code output:

Benchmark implementation:

Benchmark result:

### Details

Why not trampoline?

A concise trampoline tail-call optimization can solve mutual recursion cases directly (my previous work has examples). It is however dead slow because of the cost of creating and destructing function execution contexts (= the stack). A trampoline implementation could make sense in the Javascript engine itself. This is still an open issue, since the engine would have to first decide whether the code can be safely optimized.

Why not a `switch` statement?

The code optimization basically simulates `goto` statements using imbricated `while` loops (and `continue`), where a given function implementation may appear multiple times. A simpler alternative would have been a single `while` loop, with a flat `switch...case` in it, where a given function implementation appears only once. The `switch` variant leads however to much slower solution (e.g. 6 times slower in the modulo3 example).

What is the difference with your previous work?

My previous work was an ad-hoc implementation of tail recursion optimization, where a single function calls itself. Mutual recursion was not supported, and there was no safety against namespace collision.

The present page provides a cleaned-up, solid implementation:

• Added: Support for mutual recursion.
• Added: Integration into a build framework ⇒ production examples.
• Added: Collision resolution through automatic renaming of local variables.
• Added: Support for object methods.
• Dropped: Shoddy regular expressions and function decompilation.

What does the `tailopt()` function?

Nothing. Roughly speaking, `tailopt()` is just the identity function: `tailopt( f ) === f`, and can thus be seen as a marker, which tells that a part of the code needs to be replaced with an optimized version. This decision is taken by the programmer, thus avoiding surprises.

Its sibling, the `tailopt.str()` function, does the job, transforming one string of Javascript code into another string, changing only the parts marked with `tailopt()`. In the interactive area above,

``output_code_string = tailopt.str( input_code_string ).new_code();``

Why not function decompilation?

Because (1) its behaviour is not fully specified in the ECMAScript standard (3, and I believe 5 as well), and (2) Javascript functions are first-class objects, thus anyone can replace their `toString` method in an arbitrary manner, making function decompilation completely unreliable.

The `toSource` method of Firefox might help here, however it has not been adopted by other browsers.

These issues with function decompilation explain the design I adopted, using a passive `tailopt()` marker, as explained just above.

Are mutual-recursive object methods supported?

Yes. You have to explicit `"this.<method_name>"`. Here is a mutual-recursive example:

``````my_object = {
meth_1 : tailopt( "this.meth_1", function (...) {... return this.meth_2(...); ...} ),
meth_2 : tailopt( "this.meth_2", function (...) {... return this.meth_1(...); ...} )
};``````

Who can more, can less: self-recursive methods, as used in at least one large application: tailopt.js itself:

``````
{
...
, fetch_all_dependencies : tailopt(

"this.fetch_all_dependencies"

, function (/*?array?*/output, /*?array?*/input)
{
if (!arguments.length)  // Start
return this.fetch_all_dependencies( [], [].concat( this._arr ) ); // concat: copy

if (!input.length)  // End
return output;

// Iter

// Note: the line below is NOT a tail call
output = output.concat( this.fetch_dependencies( input.shift() ) );

// Note: the line below is a tail call
return this.fetch_all_dependencies( output, input );
}
)
...
}``````

... which, after applying prep_tailopt.js, becomes:

``````{
...
, fetch_all_dependencies: function (output, input) {
var undef, output, input, output_new, input_new;

_L_fetch_all_dependencies_: while (true) {
if (!output  ||  !input) {
output_new = [];
input_new = [].concat(this._arr);
output = output_new;
input = input_new;
continue _L_fetch_all_dependencies_;
}
if (!input.length) return output;
output = output.concat(this.fetch_dependencies(input.shift()));
{
output_new = output;
input_new = input;
output = output_new;
input = input_new;
continue _L_fetch_all_dependencies_;
}
return;
}
}
...
}
``````

### Example: Searching a sorted array

The task: search a sorted array with holes and repetitions for the first and last occurences of an element, or return `null` if not found. For example, in the array of integers `[1, 2, 2, 4, 5, 11, 11, 11, 13, 14]`, the first and last occurences of 11 are at positions 5 and 7, respectively.

I used this in real-life applications (autocomplete, QR trees) to implement a fast search in the frontend without having to maintain a complex data structure. This comes particularly handy when the data can be modified.

Here is a tail-call based implementation that does joint binary search of the first and last occurences of `x`, clearly separating the two steps into two different functions `searchFirst` and `searchLast`.

``````Array.prototype.sortedSearch = tailopt(
function (x, less, equal)
{
// Assume `this` is a sorted array. Search for the first and
// last occurences of `x`.
//
// If `x` found, return `[ first_index, last_index ]`
// (integers).
//

less  = less || function (a,b) { return a < b; };
equal = equal || function (a,b) { return a == b; };

var sortedArray = this;

return searchFirst(0, sortedArray.length - 1, false, false);

// Implementation: joint binary search.

function searchFirst(i, j, first_found, last_found)
{
first_found = first_found || isFirstFound(i);

// Termination tests

if (i > j)

if (first_found && last_found)
return [i, j];  // Done: `x` found.

// Update

if (!first_found)
{
i++;

var ind = i + ((j - i) >> 1)
, v_ind = sortedArray[ind]
;

if (less(v_ind, x) || isFirstFound(ind))
i = ind; // Update
}

return searchLast(i, j, first_found, last_found);
}

function searchLast(i, j, first_found, last_found)
{
last_found = last_found || isLastFound(j);

// Termination tests already done in `searchFirst` -> not needed here.

// Update

if (!last_found)
{
j--;

var ind = j - ((j - i) >> 1)
, v_ind = sortedArray[ind]
;

if (less(x, v_ind) || isLastFound(ind))
j = ind; // Update
}

return searchFirst(i, j, first_found, last_found);
}

function isFirstFound(i)
{
return equal(x, sortedArray[i])  &&  (i < 1  ||  !equal(x, sortedArray[i - 1]));
}

function isLastFound(j)
{
return equal(x, sortedArray[j])  &&  (j > sortedArray.length - 2  ||
!equal(x, sortedArray[j + 1]));
}
}
);
``````

Produced on 2014-08-13 by tomrec.scm - by Guillaume Lathoud (glathoud _at_ yahoo _dot_ fr) 