sweet-tensors

##Tensor index notation in javascript using sweet.js

This sweet.js macro allows you to write javascript for loops using the power and simplicity of the index notation used in tensor calculus.

So this:

for (var j = 0, lj = baz.length; j < lj; j++) {
  for (var i = 0, li = bar.length; i < li; i++) {
    foo[i][j] = bar[i] + baz[j];
  }
}

becomes this:

tensor foo[i][j] = bar[i] + baz[j];

You can test the macro out for yourself using the online editor.

###What is Index notation?

Not familiar with index notation? Observe:

foo[i][j] = bar[i] + baz[j]

This is the sort of code you normally expect to see inside a double for loop. You know that because you see there are two index variables, i and j. i is used to iterate through bar, and j is used to iterate through baz. The lower bound of i and j is 0. The upper bound of i and j is determined by the size of their respective arrays. For every value in bar and baz, we populate a cell in foo with their sum. Simple.

You know all this right away and you never looked at a single for loop. You pieced it all together through convention and reasoning. The for loops are just formality.

Tensor index notation skips the formality. In tensor calculus, the "for loops" are implied by the index. There are a few other rules that govern its use in mathematics, but for our sake we will ignore them.

If we want to implement this index notation, a macro need only traverse the expression, find the indices, and associate each index with the array that preceded it. We no longer have to write for loops. The only fluff we still need is a keyword to kick off the macro.

###Installation

You will need to install the latest versions of nodejs (>4.x) and npm (>2.x) if you haven't done so already. You will also need to install the latest version of Sweet.js (>1.0) in your project directory using npm. Instructions on how to do so can be found in the Sweet.js tutorial.

Once Sweet.js is installed in your project directory, you have several options on how to use sweet-tensors, as with any Sweet.js macro.

The simplest option is to copy the contents of sweet-tensors.sjs to the top of an existing file where you want to use the macro. When you want to transpile the macro, you would run the following command:

nodejs --harmony node_modules/.bin/sjs --module your-file.sjs js  > your-file.js;

Here, "your-file.sjs" is the file you've copied the macro to.

Another option is to copy sweet-tensors.sjs and set up a build process that concatenates the files and transpiles the result. The build process would look something like this:

mkfifo js;
cat sweet-tensors.sjs your-file.sjs > js &
nodejs --harmony node_modules/.bin/sjs --module sweet-tensors.sjs js  > your-file.js;

Again, replace "your-file.sjs" with the file that uses the macro. You can see an example of this build process in the "demo.sjs" and "demo.sh" files included in the sweet-tensors project.

If you're looking for a more robust build tool, you may also consider the following options:

###Tutorial

tensor foo[i][j] = bar[i] + baz[j];

tensor is the keyword that kicks off the macro. It separates the code that uses the index notation from the code that doesn't. The macro will apply only to the statement that immediately follows the tensor keyword. If you want multiple lines to share the same indices, you can also wrap your code within a pair of braces, just as you would with any if or for loop.

tensor {
	foo[i][j] = bar[i]
	foo[i][j] += baz[j]
}

A single block of tensor code will be wrapped in one for loop for every index that occurs within the block.
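As a rough illustration, the block above would be expected to expand to something like the hand-written loops below. This is a sketch rather than actual macro output: the sample arrays, the pre-allocated rows of foo, and the loop order are all assumptions made so the snippet runs on its own.

```javascript
// Hypothetical sample data (not from the project).
var bar = [1, 2, 3];
var baz = [10, 20];
// foo needs one row per element of bar before the loops run.
var foo = [[], [], []];

// One for loop per index; both statements share i and j.
for (var i = 0, li = bar.length; i < li; i++) {
  for (var j = 0, lj = baz.length; j < lj; j++) {
    foo[i][j] = bar[i];
    foo[i][j] += baz[j];
  }
}

console.log(foo); // [ [ 11, 21 ], [ 12, 22 ], [ 13, 23 ] ]
```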

An "index" is defined as any single letter variable that occurs at least once alone in brackets.

tensor foo[i] = bar[map[i]] + baz[j-1] + k;

In the example above, map is not an index because its name contains more than one letter. The constant k is not an index because it never occurs within a pair of brackets, so the macro has no way to determine its upper bound. The variable j is not an index because it only occurs inside brackets alongside a -1, and in the general case the macro cannot know what the bounds of j should be. The variable i is an index because it is a single letter variable and it occurs at least once by itself in a pair of brackets.
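The rule can be approximated on plain source text with a regular expression. The helper below, findIndices, is a hypothetical name and a toy: it is not how sweet-tensors is implemented, and it only looks for a single letter standing alone between brackets.

```javascript
// Toy approximation of the "single letter alone in brackets" rule.
// findIndices is a hypothetical helper, not part of sweet-tensors.
function findIndices(source) {
  // "[", optional whitespace, exactly one letter, optional whitespace, "]"
  var pattern = /\[\s*([A-Za-z])\s*\]/g;
  var found = [];
  var match;
  while ((match = pattern.exec(source)) !== null) {
    // Record each index once, in order of first appearance.
    if (found.indexOf(match[1]) === -1) found.push(match[1]);
  }
  return found;
}

console.log(findIndices('foo[i] = bar[map[i]] + baz[j-1] + k;')); // [ 'i' ]
console.log(findIndices('a[i][k] += b[i][j] * c[j][k];'));        // [ 'i', 'k', 'j' ]
```

Neither [map[i]] nor [j-1] matches as a whole, so map and j are not reported, and k never appears in brackets at all.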

As noted above, the block is wrapped in one for loop for every index it contains. The order in which the for loops are nested depends on whether any index relies on another to ascertain its upper bound. Take for instance the following statement:

tensor f( foo[j][i] );

The upper bound of i is determined by foo[j].length. This means that index i has a dependency on index j. The for loop for i has to reside within the for loop for j, so the statement above will expand out to the following:

for (var j = 0, lj = foo.length; j < lj; j++) {
  for (var i = 0, li = foo[j].length; i < li; i++) {
    f(foo[j][i]);
  }
}

In the event no dependencies exist between indices, the order of their for loops is arbitrary.

The lower bound of an index is always 0. The upper bound of an index is determined through the length of a single array where the index is used. In the event there are multiple arrays to choose from, the macro will avoid using multi-dimensional arrays and arrays that are returned from functions. In the following code:

tensor foo[j][i] = bar[i] + baz()[i];

The upper bound of i is set to bar.length. The macro could use foo[j].length, but chooses not to because this would be inefficient. The macro could use baz().length, but chooses not to because calling baz() may introduce unwanted side effects. It is up to the user to ensure foo[j] is always the same size as bar and baz(). It is also up to the user to ensure functions such as baz() do not contain side effects.
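To make the point about side effects concrete, here is a hand-written version of the loops these rules imply. It is a sketch under assumptions: the data is made up, baz is a hypothetical function that counts its own calls, the bound of j is assumed to come from foo.length, and the loop order is arbitrary because i no longer depends on j.

```javascript
// Hypothetical data and function, chosen only for illustration.
var calls = 0;
var bar = [1, 2, 3];
function baz() {
  calls++;              // visible side effect: count every call
  return [10, 20, 30];
}
var foo = [[], []];     // two rows, filled in by the loops

for (var j = 0, lj = foo.length; j < lj; j++) {
  // The bound of i comes from bar, not from foo[j] or baz().
  for (var i = 0, li = bar.length; i < li; i++) {
    foo[j][i] = bar[i] + baz()[i];
  }
}

console.log(foo);   // [ [ 11, 22, 33 ], [ 11, 22, 33 ] ]
console.log(calls); // 6
```

Even though baz() is not used for the bound, it still runs once per iteration of the body, which is why a function with side effects is risky here.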

The upper bound of an index can be retrieved by prefixing the variable name with "l", e.g., the upper bound of index i is li. This is sometimes a more efficient way to retrieve the length of an array because it is evaluated once, before the loop starts, rather than on every iteration as .length would be.

tensor mean += foo[i] / li;

The example above calculates the mean of all values within foo. It is more efficient than calling:

tensor mean += foo[i] / foo.length;

However, please note it is not the most efficient method, because division occurs with every iteration. A more efficient implementation would be:

tensor mean += foo[i];
mean /= foo.length;

The tensor macro is a dumb thing. It doesn't check to see when there is a division by a constant. All it does is expand the for loop in the most efficient way afforded by the general case.
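Written out by hand, the first and last variants look roughly like this. The sample array and the names meanA and meanB are assumptions for the sake of a runnable snippet. Note that the accumulator has to start at 0, since += on an undefined variable yields NaN.

```javascript
var foo = [2, 4, 6, 8]; // hypothetical sample data

// tensor mean += foo[i] / li;
// One division per element.
var meanA = 0;
for (var i = 0, li = foo.length; i < li; i++) {
  meanA += foo[i] / li;
}

// tensor mean += foo[i];
// mean /= foo.length;
// One division in total.
var meanB = 0;
for (var k = 0, lk = foo.length; k < lk; k++) {
  meanB += foo[k];
}
meanB /= foo.length;

console.log(meanA, meanB); // 5 5
```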

###Example Usage

You can use tensors to replicate most of the built-in support for functional programming:

// 		foo = bar.map(map_fn);
tensor 	foo[i] = map_fn(bar[i]);

// 		foo = bar.filter(filter_fn);
tensor 	if( filter_fn(bar[i]) ) foo.push( bar[i] );

//      foo = bar.reduce(reduce_fn, 0);
var foo = 0;
tensor 	foo = reduce_fn( foo, bar[i] );

//		foo.forEach(forEach_fn);
tensor 	forEach_fn( foo[i] );

In certain cases the tensor statements will be faster than the built-in equivalents. The use of tensors will almost always be faster than using anonymous functions.
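For comparison, here is what the map, filter and reduce statements amount to as plain loops, next to the built-in calls. This is a hand-written sketch: the helper functions and data are hypothetical, and the bound of i is assumed to come from bar rather than from the (initially empty) target array.

```javascript
// Hypothetical data and callbacks.
var bar = [1, 2, 3, 4];
function double(x) { return x * 2; }
function isEven(x) { return x % 2 === 0; }
function add(a, b) { return a + b; }

// tensor mapped[i] = double(bar[i]);
var mapped = [];
for (var i = 0, li = bar.length; i < li; i++) {
  mapped[i] = double(bar[i]);
}

// tensor if (isEven(bar[i])) filtered.push(bar[i]);
var filtered = [];
for (i = 0; i < li; i++) {
  if (isEven(bar[i])) filtered.push(bar[i]);
}

// tensor total = add(total, bar[i]);
var total = 0;
for (i = 0; i < li; i++) {
  total = add(total, bar[i]);
}

console.log(mapped, bar.map(double));      // [ 2, 4, 6, 8 ] [ 2, 4, 6, 8 ]
console.log(filtered, bar.filter(isEven)); // [ 2, 4 ] [ 2, 4 ]
console.log(total, bar.reduce(add, 0));    // 10 10
```

In each case the target (mapped, filtered, total) has to be declared before the statement that fills it.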

Other built-in methods can be replicated as well. The practicality is questionable, but it does demonstrate the macro's versatility.

// 		foo.fill(0);
tensor 	foo[i] = 0;

// 		foo.reverse();
tensor {
	if(i >= li/2) break;
	var temp = foo[i];
	foo[i] = foo[li-1-i];
	foo[li-1-i] = temp;
}

//		foo = bar.every(test_fn)
var foo = true;
tensor 	foo = foo && test_fn(bar[i]);

//		foo = bar.some(test_fn)
var foo = false;
tensor 	foo = foo || test_fn(bar[i]);
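One difference from the built-ins is worth seeing in a hand-written loop: every() stops iterating at the first failure, whereas a loop of this shape visits every element and relies on && to skip further calls to the test function. The sketch below uses hypothetical data and a hypothetical isPositive function that counts its calls.

```javascript
var bar = [1, -2, 3, -4]; // hypothetical sample data
var tested = 0;
function isPositive(x) {
  tested++;               // count how often the test actually runs
  return x > 0;
}

// tensor all = all && isPositive(bar[i]);
var all = true;
var iterations = 0;
for (var i = 0, li = bar.length; i < li; i++) {
  iterations++;
  all = all && isPositive(bar[i]); // && skips the call once all is false
}

console.log(all);        // false
console.log(tested);     // 2
console.log(iterations); // 4
```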

The macro also allows you to easily borrow paradigms from other languages. Take for instance the logical index vector in R:

var		strings = ['a','b','c','d','e'];
var		bools 	= [false, true, false, true, false];
var 	filtered = [];
tensor 	if(bools[i]) filtered.push( strings[i] );

The numeric index vector:

var		strings = ['a','b','c','d','e'];
var		nums 	= [1, 2, 4];	// zero-based, unlike R
var 	filtered = [];
tensor 	filtered.push( strings[nums[i]] );

Or the which() function:

var		bools 	= [false, true, false, true, false];
var 	which 	= [];
tensor 	if(bools[i]) which.push(i);
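Expanded by hand, the logical index vector and which() examples come down to a single loop over bools. This is a sketch that assumes the bound of i is taken from bools. The positions collected are zero-based, where R would report them one-based.

```javascript
var strings = ['a', 'b', 'c', 'd', 'e'];
var bools = [false, true, false, true, false];

var filtered = []; // elements of strings selected by bools
var which = [];    // positions where bools is true

for (var i = 0, li = bools.length; i < li; i++) {
  if (bools[i]) filtered.push(strings[i]);
  if (bools[i]) which.push(i);
}

console.log(filtered); // [ 'b', 'd' ]
console.log(which);    // [ 1, 3 ]
```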

And linear algebra is trivial, of course:

tensor 	a += b[i] * c[i]; 			// dot product
tensor 	a[i] = b[i] + c[i]; 		// addition
tensor 	a[i] += b[i][j] * c[j]; 	// matrix * vector
tensor 	a[i][k] += b[i][j] * c[j][k]; // matrix * matrix
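As a sanity check, here are the dot product and the matrix product written as ordinary loops. The names dot, m, n and p and the sample values are assumptions. Because these statements use +=, the accumulators have to be zeroed first.

```javascript
// Dot product: tensor dot += b[i] * c[i];
var b = [1, 2, 3];
var c = [4, 5, 6];
var dot = 0;
for (var i = 0, li = b.length; i < li; i++) {
  dot += b[i] * c[i];
}
console.log(dot); // 32

// Matrix product: tensor p[i][k] += m[i][j] * n[j][k];
var m = [[1, 2], [3, 4]];
var n = [[5, 6], [7, 8]];
var p = [[0, 0], [0, 0]]; // must start at zero because of +=
for (i = 0, li = m.length; i < li; i++) {
  for (var j = 0, lj = n.length; j < lj; j++) {
    // k depends on j: its bound is the length of row n[j].
    for (var k = 0, lk = n[j].length; k < lk; k++) {
      p[i][k] += m[i][j] * n[j][k];
    }
  }
}
console.log(p); // [ [ 19, 22 ], [ 43, 50 ] ]
```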
