JS optimization techniques: Tail calls and beyond


Guillaume Lathoud

Alpstein Tourismus GmbH, Germany


slides: http://glat.info

Code that scales: performance

Tight
Unfamiliar

...and team

     Clear             Familiar structure

Lower threshold of adoption

"Tail calls"

Write expressive code without for/while,

and generate fast code.

Limits.



"...and beyond"

Similar spirit, but for array processing.

Tail calls: practical definition

Return the value obtained from a single function call.

// no return statement
t.children.forEach(doSomething);
      
 
// not a single function call
return sumtree(t.left) + sumtree(t.right);
        
// return + single function call        
return gcd_rec(a-b, b);
      

Tail calls example: Greatest Common Divisor (GCD)

Task

Find the GCD of two numbers.

910 = 2*5*7*13

1235 = 5*13*19

5*13 = 65

Solution

Euclid's recursive algorithm:

GCD(a,b) is:
  if a>b: GCD(a-b,b)
  if a<b: GCD(b-a,a)
  if a==b: a


Implementations

function gcd_rec(a, b) {

  if (a > b)
    return gcd_rec(a-b, b);

  if (b > a)
    return gcd_rec(b-a, a);

  return a;
}
alert(gcd_rec(910, 1235));

No loop statement,
only branch & return.

+ Clear: 100% like the algo.

Slow: function calls.

function gcd_loop(a, b) {
  while (a != b) {
    if (a > b) {
      a = a-b;
    }
    else if (b > a) {
      var c = a;
      a = b-a;
      b = c;
    }
  }
  return a;
}
alert(gcd_loop(910,1235));

while, branch & return.

Difficult to read (algo?).

+ Fast: no function call.

tail

call

optim.

Tail call optimization

Implementation:
http://glat.info/jscheck/tomrec_prod.html


Results:
http://glat.info/jscheck/tomrec_prod.html#results

Great! What about large-scale development?


  Not a tail call ⇒ not transformed:

 
// no return statement
t.children.forEach(doSomething);
 // not a single function call
return sumtree(t.left) + sumtree(t.right);

  Tail call ⇒ automatically removed (replaced with while & continue):

// return + single function call
return gcd_rec(a-b, b);

Implicit difference.

Who's going to know? (*)











(*) http://neopythonic.blogspot.de/2009/04/final-words-on-tail-calls.html

Can we explicitly differentiate non-tail calls from tail calls ?

Tail metacomposition

Explicit difference using two new keywords metafun and metaret.

// WITH function calls (call stack)
...
var v = f(x);
...
return f(x); 
...
return f(x)+g(x);
...
o.doSomething();
...

// WITHOUT function calls (no call stack)
metafun gcd(self, a, b) { // metafunction: contains metaret

  if (a > b)
    metaret self, a-b, b; // NOT a call, rather a sanitized goto

  if (b > a) {
    metaret self, b-a, a; // NOT a call, rather a sanitized goto

  return a;
}
            

Tail metacomposition implementation

Webpage: http://glat.info/js.metaret/

Source: https://github.com/glathoud/js.metaret


Simpler than TCO: only where metaret used.

Fast result: Generate a while loop, as before.

metafun gcd(self, a, b) {

  if (a > b)
    metaret self, a-b, b;

  if (b > a)
    metaret self, b-a, a;

  return a;
}
function anonymous(a, b) {
  _L_gcd_: while (true) {

    if (a > b) {
      var _a1_ = a - b;
      a = _a1_;
      continue _L_gcd_;
    }   
   
    if (b > a) {
      var _a_ = b - a;
      var _b_ = a;
      a = _a_;
      b = _b_;
      continue _L_gcd_;
    }
   
    return a;
  }
}

Only good for math?

Tail metacomposition: longer example

Search a value in a sorted array with duplicates:

[ 0, 1, 1, 2, 2, 5, 6, 6, 6, 7, 9, 9 ]
  F                                L    (init)
     F                                  
                    L                   (bisection)
        F                                               
              L                                         
           F                                                  
          #3  #4                                         
metafun improveFirst(self) {

  if (first_found && last_found) // Termination test
    return [i, j];  // Done: `x` found.
 
  ... // Update
     
  metaret improveLast; 
} 
 
metafun improveLast(self) { 
 
  ... // Update
    
  metaret improveFirst; 
} 
http://glat.info/js.metaret/#sec-sorted-search

Tail metacomposition:


Instead of functions, for, while,
function gcd(a,b){ while...
tail calls...
return gcd(a-b, b);
...use metafun
metafun gcd(a, b) { ...
and metaret (sanitized goto):
metaret self, a-b, b;

Write *less* (for, while, functions, calls)

Really? No for? No while? No function call?

metafun sum(self, arr) { 

  var n = arr.length;

  metaret sum_iter, 0, 0;

  metafun sum_iter(self, current, i) {

    // Are we done yet?

    if (i < n) // No
      metaret self, current+arr[i], i+1;

    // Yes
    return current;
  }
}
        
Ugggh... Somehow I still prefer:
var sum = arr.reduce(function (a,b) { return a+b; });
...but when data grows, for speed (http://jsperf.com/arr-reduce):
var sum = 0;
for (var i=arr.length; i--; )
  sum += arr[i];

Looks familiar?

"...and beyond": array processing

map, filter, reduce...

Different context: tail calls/tail metacomposition not so wise here...


...but can we similarly develop a no loop/no call approach?

...and generate fast code?

Especially when data grows?

Code generation for fast array processing

"Data grows": http://www.outdooractive.com/en/tourplanner/

On-the-fly statistics: length, duration etc.

User expects: O(1)

 

Proposition

Write readable code...
sum = alp_reduce(arr,"+");
...and generate fast runtime code:
sum = 0;
for (var i=arr.length; i--;)
  sum += arr[i];
          

Implementation (simplified)

function alp_reduce(arr, /*string*/expr) {
  var f = new Function( "arr,v,i",
    "var n=arr.length;" + 
    "for(var w;i<n;i++){v=(w=arr[i]," + alp_e(expr) + ");}" +
    "return v;");
  return f(arr,arr[0],1);
}
function alp_e(expr) { ... }  // alp_e("+") returns "v+w"
        

Full implementation + speed results

http://jsperf.com/arr-reduce-gen

 

or using a sweet.js macro (thanks to Adam Peresztegi for the suggestion)

Future optimization: multiple stages (1)

Incomplete or corrupt data:
arr = [ { p: 12 }, { p : null }, { p: 34 }, { p: 56 } ];
Sum:
          sum_p = arr.reduce(function (a,b) { return a + (b.p || 0); },0);
          //            1        3                     6   4     5     2
      
Would be more expressive and clearer (operation order):
          sum_p = arr.map(".p").filter("!=null").reduce("+");
          //           1          2                 3 
        

Issue

3 stages means: 3 loops and 2 throw-away intermediary arrays.

Waste of memory and CPU ("data grows").

Future optimization: multiple stages (2)

Proposition

Write 3 stages...
// f is a function (generated)
var f = Array.map(".p").filter("!=null").reduce("+");
          
...which generates efficient single loop code...
function f(arr) {
  var n   = arr.length,
      ret = 0;
      
  for (var i=0; i<n; i++) {
    var v = arr[i];
    v = v.p;         // .map(".p")
    if (v != null)   // .filter("!=null")
      ret = ret + v; // .reduce("+")
  }

  return ret;                  
}
...then use f:
              var sum_p = f(arr);
            

Worth implementing?

http://jsperf.com/arr-map-filter-reduce

Proposition: Fast typed array processing

Proposition: Simple math on typed arrays

          // Element-by-element addition of two *typed* arrays:
          var ta3 = ta1 + ta2;       // JS syntax extension

          // ta1: [ <Float64>,  <Float64>,  <Float64>, ... ]
          // +
          // ta2: [ <Float64>,  <Float64>,  <Float64>, ... ]
          // =
          // ta3: [ <Float64>,  <Float64>,  <Float64>, ... ]


          // or:
          var f = Float64Array.merge("+"); // compile f
          var ta3 = f(ta1, ta2);           // execute f

          // or shortcut:
          var ta3 = ta1.merge("+", ta2);
        

Interpretation: win-win

- Write clear, portable code,

- and make speedup easy (JS engines):

 ☆ parallelization: multicore, SIMD,

 ☆ inlining "+": 1 loop instead of many function calls.

What about more complicate operations?

Proposition: Advanced math on typed arrays

          function getDistance(x0, y0, x1, y1) { 
            var dx = x1-x0;
            var dy = y1-y0;
            return Math.sqrt(dx*dx + dy*dy);
          }

          // compile -> f is a function
          var f = Float64Array.merge(getDistance);

          // execute f element-by-element
          // d, x0, y0, x1, y1: all Float64Arrays
          var d = f(x0, y0, x1, y1); 

        

- Write clear, portable code,

- and make speedup easy (JS engines):

 ☆ Parallelization: element-by-element,

 ☆ Inlining: 1 loop with the body of getDistance,

 ☆ Native code: e.g. when getDistance produced by asm.js.

Applications

3D: parts that do not fit on the GPU,

Digital Signal Processing: speech, video etc. (repeat the smallest array),

...

Conclusion

What was this all about?

Code that scales

sum = arr.reduce( 
  function (a,b){
    return a+b; 
});
  reduce
sum = alp_reduce(arr,"+");
  reduce-gen
     Clear          Familiar structure

Low threshold of adoption

Tail metacomposition:


Instead of functions, for, while,
function gcd(a,b){ while...
tail calls...
return gcd(a-b, b);
...use metafun,
metafun gcd(a, b) { ...
and metaret (sanitized goto):
metaret self, a-b, b;

Arrays, number crunching:


Instead of functions, for, while...
sum=arr.reduce(function...)
for(..)ta3[i]=ta1[i]+ta2[i]
...use "actions":
sum = arr.reduce("+")
ta3 = ta1.merge("+", ta2)

Common spirit: write *less* (for, while, functions, calls)


Think/write/read concise code, in a natural order, with explicit optimizations.

☆ fast!

cost: no stack trace. But writing for/while loops by hand not much better.

You can already use http://glat.info/js.metaret & jsperf.com/arr-reduce-gen


Thank you!


http://glat.info
http://www.outdooractive.com
http://developers.outdooractive.com