## Appendices of 'On-the-fly tail call optimization in Javascript'

In my opinion the article has already enough numbers, tables, comments etc. Here are the appendices.

### A. Comparison with trampoline implementations

Below are two trampoline implementations, `tailcatch` and `tailtramp`, and the results. As expected, they are much slower than `tailopt`. Note that these trampoline implementations both support mutual recursion, which `tailopt` currently does not.

``````// Copyright (c) 2010 Guillaume Lathoud
//
// tailcatch.js
//
// Implement full tail-call optimization in Javascript through
// a throw/catch trampoline.
//
// Heavily inspired from Sjoed Visscher's work:
// http://w3future.com/weblog/2006/02/#tailCallEliminationInJavascript
//
// Difference: I brought some speed improvements:
// - removed the caller search for self.
// - reduced the length of the throw/catch chain from n to 1.

/*jslint evil:false */

function tailcatch(g) {
return (function (n) {

return function()
{
if (n>5) {  // Jump off a small Empire State Building
throw {tailCallArgs: arguments, tailCallThis: this};
}

var args = arguments;
var me   = this;
var ret;
while (true)
{
if (n<1) {
try
{
n++;
ret = g.apply(me, args);
n--;
return ret;
}
catch(e) // Hit 33rd Street and bounce again
{
if (!e.tailCallArgs) {
throw e;
}
n=0;
args = e.tailCallArgs;
me   = e.tailCallThis;
}
} else {
n++;
ret = g.apply(me, args);
n--;
return ret;

}
}
};

})(0);
}
``````

And here is `tailtramp`:

``````// Copyright (c) 2010 Guillaume Lathoud
//
// tailtramp.js
//
// Implement full tail-call optimization in Javascript through a
// trampoline.

/*jslint evil:false */

function tailtramp(g) {
function TailCall(arr) {
this.arr = arr;
}
return (function (n) {

return function()
{
var arr = [g, this, arguments], ret;

if (n>5) {  // Jump off a small Empire State Building
return new TailCall(arr);
}

while (true)
{
n++;
ret = arr[0].apply(arr[1], arr[2]);
n--;

if ((n>0) || (!(ret instanceof TailCall))) {
return ret;
}

// Hit 33rd Street and bounce again
arr = ret.arr;
}
};

})(0);
}
``````

System: Ubuntu 8.04 on a Thinkpad T43 (Intel Pentium M, 2 GHz, 1.5 GB RAM)

V8 version: 2.1.0

Test: 10 runs

``````d8 -e "load('tailopt-js-rerun.js');rerun(10);"
...
fact_rec_________________ iter_sec: 2.680e-7 (100: base)
fact_rectailcatch________ iter_sec: 6.104e-6 (2277)
fact_rectailtramp________ iter_sec: 2.146e-6 (801)
fact_tail________________ iter_sec: 4.190e-7 (156)
fact_tail2_______________ iter_sec: 4.150e-7 (155)
fact_tailcatch___________ iter_sec: 6.691e-6 (2496)
fact_tailtramp___________ iter_sec: 3.204e-6 (1195)
fact_tailopt_____________ iter_sec: 1.897e-7 (71)
fact_tailopt2____________ iter_sec: 1.878e-7 (70)
fact_iter________________ iter_sec: 1.339e-7 (50)
gcd_rec__________________ iter_sec: 2.633e-6 (100: base)
gcd_rectailcatch_________ iter_sec: 5.829e-5 (2214)
gcd_rectailtramp_________ iter_sec: 2.468e-5 (937)
gcd_tailopt______________ iter_sec: 1.092e-6 (41)
gcd_iter_________________ iter_sec: 1.220e-6 (46)
treeforeach_rec__________ iter_sec: 2.255e-4 (100: base)
treeforeach_tail_________ iter_sec: 2.921e-4 (130)
treeforeach_tailcatch____ iter_sec: 4.555e-4 (202)
treeforeach_tailtramp____ iter_sec: 3.590e-4 (159)
treeforeach_tailopt______ iter_sec: 2.843e-4 (126)
``````

### B. Standard deviation

To verify that the performance measurements are somewhat meaningful, I ran each test 500 times and computed the standard deviation. Thanks to Sam Mason for the suggestion.

Reminder: the result `iter_sec` is the duration of a single call, in seconds: the lower, the better.

#### B1. Rhino

System: Ubuntu 8.04 on a Thinkpad T43 (Intel Pentium M, 2 GHz, 1.5 GB RAM)

Java version: 1.5.0, gij (GNU libgcj) version 4.2.4 (Ubuntu 4.2.4-1ubuntu3)

Rhino version: 1.6 release 7 2008 10 14

Test: 500 runs

``````rhino  -e "load('tailopt-js-rerun.js');rerun(510,10);"
...
rerun() started, subset: null , nrunmin: 510 , nruntoforget: 10 , global.runs: 509
..
fact_rec_____________ iter_sec: 3.984e-4 (100: base) [mean:4.037e-4, std:9.700e-6, ratio:2.403e-2]
fact_tail____________ iter_sec: 5.986e-4 (150)       [mean:5.803e-4, std:1.330e-5, ratio:2.292e-2]
fact_tailopt_________ iter_sec: 4.862e-4 (122)       [mean:4.919e-4, std:1.298e-5, ratio:2.638e-2]
fact_tail2___________ iter_sec: 5.714e-4 (143)       [mean:5.783e-4, std:1.364e-5, ratio:2.358e-2]
fact_tailopt2________ iter_sec: 4.745e-4 (119)       [mean:4.749e-4, std:1.066e-5, ratio:2.245e-2]
fact_iter____________ iter_sec: 1.167e-4 (29)        [mean:1.188e-4, std:3.392e-6, ratio:2.857e-2]
gcd_rec______________ iter_sec: 3.000e-3 (100: base) [mean:3.039e-3, std:9.211e-5, ratio:3.031e-2]
gcd_tailopt__________ iter_sec: 2.964e-3 (99)        [mean:2.977e-3, std:4.393e-5, ratio:1.476e-2]
gcd_iter_____________ iter_sec: 8.356e-4 (28)        [mean:8.045e-4, std:1.424e-5, ratio:1.770e-2]
treeforeach_rec______ iter_sec: 5.363e-2 (100: base) [mean:5.414e-2, std:2.201e-3, ratio:4.065e-2]
treeforeach_tail_____ iter_sec: 5.474e-2 (102)       [mean:5.509e-2, std:6.106e-4, ratio:1.108e-2]
treeforeach_tailopt__ iter_sec: 6.244e-2 (116)       [mean:6.269e-2, std:1.004e-3, ratio:1.602e-2]
``````

The ratio std/mean is about 2-3% in most cases. Only for `treeforeach_rec`, the ratio is about 4%. Conclusion: the duration differences are meaningful for `fact` and `gcd`, and the three variants of `treeforeach` perform similarly in average.

System: Ubuntu 8.04 on a Thinkpad T43 (Intel Pentium M, 2 GHz, 1.5 GB RAM)

V8 version: 2.1.0

Test: 500 runs

``````d8 -e "load('tailopt-js-rerun.js');rerun(510,10);"
...
rerun() started, subset: null , nrunmin: 510 , nruntoforget: 10 , global.runs: 509
...
fact_rec_____________ iter_sec: 2.657e-7 (100: base) [mean:2.636e-7, std:1.087e-8, ratio:4.126e-2]
fact_tail____________ iter_sec: 4.011e-7 (151)       [mean:4.040e-7, std:5.537e-9, ratio:1.371e-2]
fact_tailopt_________ iter_sec: 1.855e-7 (70)        [mean:1.866e-7, std:4.243e-9, ratio:2.274e-2]
fact_tail2___________ iter_sec: 4.078e-7 (153)       [mean:4.103e-7, std:6.752e-9, ratio:1.645e-2]
fact_tailopt2________ iter_sec: 1.857e-7 (70)        [mean:1.866e-7, std:3.168e-9, ratio:1.698e-2]
fact_iter____________ iter_sec: 1.311e-7 (49)        [mean:1.322e-7, std:3.324e-9, ratio:2.514e-2]
gcd_rec______________ iter_sec: 2.659e-6 (100: base) [mean:2.651e-6, std:4.979e-8, ratio:1.878e-2]
gcd_tailopt__________ iter_sec: 1.061e-6 (40)        [mean:1.068e-6, std:2.215e-8, ratio:2.073e-2]
gcd_iter_____________ iter_sec: 1.197e-6 (45)        [mean:1.204e-6, std:2.985e-8, ratio:2.479e-2]
treeforeach_rec______ iter_sec: 2.224e-4 (100: base) [mean:2.222e-4, std:4.170e-6, ratio:1.877e-2]
treeforeach_tail_____ iter_sec: 2.855e-4 (128)       [mean:2.860e-4, std:5.037e-6, ratio:1.761e-2]
treeforeach_tailopt__ iter_sec: 2.754e-4 (124)       [mean:2.778e-4, std:2.647e-6, ratio:9.528e-3]
``````

Except for `fact_rec`, the standard deviation is only about 2% of the mean, so comparisons between means seem pretty safe! Indeed, for most comparisons the differences include much more than 3 times the standard deviation of each compared value (with a Gaussian assumption that means, quickly said, more than 99% confidence). Only `gcd_tailopt` and `gcd_iter` may be deemed similar (which is quite good!), as well as `treeforeach_tail` and `treeforeach_tailopt` (not great).

#### B3. String Replication

These are the detailed results for the String Replication section on 500 runs, in the form of:

• `mean` duration of a single iteration (in seconds) over the 500 runs - the lower, the better. In brackets, relative values, where base 100 is the recursive result,
• `std` standard deviation of the single iteration duration over the 500 runs,
• `ratio = std / mean`, which indicates the stability of the result (high for Firefox 3.6, probably because of some chronic garbage collection for the `tailopt` and `iter` versions).
``````
Firefox 3.6, Thinkpad Linux Ubuntu, 510 runs total (the 10 first runs ignored):
string_repli_rec______________ mean: 5.972e-5 (100.0), std:5.302e-6, ratio:8.878e-2
string_repli_tail_____________ mean: 1.582e-5 ( 26.5), std:1.212e-6, ratio:7.660e-2
string_repli_tailopt__________ mean: 1.105e-5 ( 18.5), std:1.261e-6, ratio:1.141e-1
string_repli_iter_____________ mean: 9.818e-6 ( 16.4), std:1.268e-6, ratio:1.291e-1

Google Chrome 4.0.249.43, Thinkpad Linux Ubuntu, 510 runs total (the first 10 runs ignored):
string_repli_rec______________ mean: 5.312e-5 (100.0), std:2.287e-6, ratio:4.305e-2
string_repli_tail_____________ mean: 4.075e-6 (  7.7), std:2.415e-7, ratio:5.926e-2
string_repli_tailopt__________ mean: 3.479e-6 (  6.5), std:2.178e-7, ratio:6.261e-2
string_repli_iter_____________ mean: 3.361e-6 (  6.3), std:2.337e-7, ratio:6.955e-2

Google V8, Thinkpad Linux Ubuntu, 510 runs total (the first 10 runs ignored):
string_repli_rec______________ mean: 3.324e-5 (100.0), std:4.964e-7, ratio:1.493e-2
string_repli_tail_____________ mean: 2.381e-6 (  7.2), std:1.742e-7, ratio:7.314e-2
string_repli_tailopt__________ mean: 2.231e-6 (  6.7), std:1.242e-8, ratio:5.565e-3
string_repli_iter_____________ mean: 2.166e-6 (  6.5), std:4.501e-8, ratio:2.078e-2

Rhino, Thinkpad Linux Ubuntu, 510 runs total (the first 10 runs ignored):