# ast-types
# Introduction
The ast-types module provides an Esprima-compatible implementation of the abstract syntax tree type hierarchy that was pioneered by the Mozilla Parser API (JavaScript:SpiderMonkey:Parser API).
# SpiderMonkey Parser API and estree/estree
WARNING
NOTE: The page JavaScript:SpiderMonkey:Parser API describes SpiderMonkey-specific behavior and is incomplete. Visit ESTree spec for a community AST standard that includes latest ECMAScript features and is backward-compatible with SpiderMonkey format.
See the estree org and the estree repo:
Once upon a time, an unsuspecting Mozilla engineer created an API in Firefox that exposed the SpiderMonkey engine's JavaScript parser as a JavaScript API. Said engineer documented the format it produced, and this format caught on as a lingua franca for tools that manipulate JavaScript source code.
Meanwhile JavaScript is evolving. This site will serve as a community standard for people involved in building and using these tools to help evolve this format to keep up with the evolution of the JavaScript language.
See also the video lecture SpiderMonkey Parser API: A Standard For Structured JS Representations by Michael Ficarra 2014 at InfoQ.
# Simple Example
The repo crguezl/hello-ast-types contains examples to learn ast-types.
The program in the file index.js contains a simple example of usage of ast-types:
import assert from "assert";
import {
namedTypes as n,
builders as b,
} from "ast-types";
import recast from 'recast';
We have imported the names of the AST types in `n` and the different builders/constructors of AST nodes in `b`.
type: module in your package.json!
When using Node.js with ES6 modules (in current versions of Node) you have to add an entry `"type": "module"` to the package.json:
➜ hello-ast-types git:(master) ✗ node --version
v16.0.0
➜ hello-ast-types git:(master) ✗ jq '.type, .dependencies' package.json
"module"
{
"ast-types": "^0.14.2"
}
Let us build an identifier node and an ifStatement node:
var fooId = b.identifier("foo");
debugger;
var ifFoo = b.ifStatement(
fooId,
b.blockStatement([
b.expressionStatement(b.callExpression(fooId, []))
])
);
- Now the `fooId` variable contains an object like `{name: 'foo', loc: null, type: 'Identifier', comments: null, optional: false, …}` and
- the `ifFoo` variable has something like `{test: {…}, consequent: {…}, alternate: null, loc: null, type: 'IfStatement', …}`
We can use the recast method `print` to obtain the corresponding code:
console.log(recast.print(ifFoo).code);
The `ifFoo` AST corresponds to the code:
➜ hello-ast-types git:(master) ✗ node index.js
if (foo) {
foo();
}
The objects in the `n.ASTType` family have `check` methods:
assert.ok(n.IfStatement.check(ifFoo));
assert.ok(n.Statement.check(ifFoo));
assert.ok(n.Node.check(ifFoo));
assert.ok(n.BlockStatement.check(ifFoo.consequent));
We can check that the call to `foo()` has no arguments like this:
assert.strictEqual(
ifFoo.consequent.body[0].expression.arguments.length,
0,
);
Here are other checks. The `check` method considers that the node `ifFoo.test` is an `Identifier` and an `Expression` but not a `Statement`:
assert.strictEqual(ifFoo.test, fooId);
assert.ok(n.Expression.check(ifFoo.test));
assert.ok(n.Identifier.check(ifFoo.test));
assert.ok(!n.Statement.check(ifFoo.test));
# Path objects
ast-types defines methods to
- traverse the AST,
- access node fields and
- build new nodes.
ast-types wraps every AST node into a path object. Paths contain meta-information and helper methods to process AST nodes.
For example, the child-parent relationship between two nodes is not explicitly defined. Given a plain AST node, it is not possible to traverse the tree up. Given a path object, however, the parent can be reached via `path.parent`.
The `NodePath` object passed to visitor methods is a wrapper around an AST node, and it serves to provide access to the chain of ancestor objects (all the way back to the root of the AST) and scope information.
In general, `path.node` refers to the wrapped node, `path.parent.node` refers to the nearest `Node` ancestor, `path.parent.parent.node` to the grandparent, and so on.
WARNING
Note that `path.node` may not be a direct property value of `path.parent.node`; it might instead be an element of an array that is a direct child of the parent node:
path.node === path.parent.node.elements[3]
# Example hello-ast-types.js
See file /crguezl/hello-ast-types/hello-ast-types.js:
import { parse } from "espree";
import { NodePath } from "ast-types";
import deb from "./deb.js";
var programPath = new NodePath(parse("x = 1; y = 2"));
console.log(deb(programPath.node));
debugger;
var xExpressionStatement = programPath.get("body", 0);
var yExpressionStatement = programPath.get("body", 1);
var xAssignmentExpression = xExpressionStatement.get("expression");
var yAssignmentExpression = yExpressionStatement.get("expression");
console.log( // Not a direct property but an element of an array
xExpressionStatement.node === xExpressionStatement.parent.node.body[0] // true
)
console.log(deb(xAssignmentExpression.node));
console.log(deb(yAssignmentExpression.node));
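➜ hello-ast-types git:(master) ✗ node --inspect-brk hello-ast-types.js
The auxiliary module deb.js imported above pretty-prints a node as colored JSON, skipping some keys: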
import cj from "color-json";
const SkippedKeys = new Set(["start", "end", "raw", "sourceType"]);
const skip = (key, value) => SkippedKeys.has(key)? undefined : value;
const deb = x => cj(JSON.stringify(x, skip, 2));
export default deb;
- Output of console.log(deb(programPath.node));
- Outputs of console.log(deb(xAssignmentExpression.node));
# path.parentPath
You should know that `path.parentPath` provides finer-grained access to the complete path of objects (not just the `Node` ones) from the root of the AST:
In reality, path.parent is the grandparent of path:
path.parentPath.parentPath === path.parent
The `path.parentPath` object wraps the elements array (note that we use `.value` because the elements array is not a `Node`):
path.parentPath.value === path.parent.node.elements
// The path.node object is the fourth element in that array:
path.parentPath.value[3] === path.node
Unlike `path.node` and `path.value`, which are synonyms because `path.node` is a `Node` object, `path.parentPath.node` is distinct from `path.parentPath.value`, because the elements array is not a `Node`.
Instead, `path.parentPath.node` refers to the closest ancestor `Node`, which happens to be the same as `path.parent.node`:
path.parentPath.node === path.parent.node
The path is named for its index in the elements array:
path.name === 3
Likewise, path.parentPath is named for the property by which path.parent.node refers to it:
path.parentPath.name === "elements"
Putting it all together, we can follow the chain of object references from path.parent.node all the way to path.node by accessing each property by name:
path.parent.node[path.parentPath.name][path.name] === path.node
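The relationships above can be reproduced with a toy wrapper. The sketch below is not the ast-types implementation: `ToyPath` is a hypothetical class written only to illustrate how `value`, `node`, `parent`, `parentPath` and `name` relate, under the assumption that a node is any object with a string `type` property.

```javascript
// Toy illustration only: NOT the real ast-types NodePath.
const isNode = (v) => v !== null && typeof v === "object" && typeof v.type === "string";

class ToyPath {
  constructor(value, parentPath = null, name = null) {
    this.value = value;           // the wrapped object (a node or an array)
    this.parentPath = parentPath; // wrapper of the object that holds `value`
    this.name = name;             // property name or index inside the parent
    this.children = new Map();    // cache for lazily created child paths
  }
  // Closest Node at or above this path
  get node() {
    return isNode(this.value) ? this.value : this.parentPath.node;
  }
  // Path of the closest ancestor that wraps a Node (arrays are skipped)
  get parent() {
    let p = this.parentPath;
    while (p && !isNode(p.value)) p = p.parentPath;
    return p;
  }
  // Child paths are created on first access and cached afterwards
  get(...names) {
    let path = this;
    for (const name of names) {
      if (!path.children.has(name)) {
        path.children.set(name, new ToyPath(path.value[name], path, name));
      }
      path = path.children.get(name);
    }
    return path;
  }
}

// Hand-written ESTree-shaped node for the source text [a, b, c, d]
const arrayNode = {
  type: "ArrayExpression",
  elements: ["a", "b", "c", "d"].map((name) => ({ type: "Identifier", name })),
};
const root = new ToyPath(arrayNode);
const path = root.get("elements", 3);

console.log(path.parentPath.value === arrayNode.elements);                    // true
console.log(path.parentPath.parentPath === path.parent);                      // true
console.log(path.parentPath.node === path.parent.node);                       // true
console.log(path.parent.node[path.parentPath.name][path.name] === path.node); // true
```

The array wrapper sits between the element and its parent node, which is why `path.parent` is two `parentPath` steps away in this example.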
These `NodePath` objects are created during the traversal without modifying the AST nodes themselves, so it's not a problem if the same node appears more than once in the AST, because it will be visited with a distinct `NodePath` each time it appears.
Child `NodePath` objects are created lazily, by calling the `.get` method of a parent `NodePath` object:
// If a NodePath object for the elements array has never been created
// before, it will be created here and cached in the future:
path.get("elements").get(3).value === path.value.elements[3]
// Alternatively, you can pass multiple property names to .get instead of
// chaining multiple .get calls:
path.get("elements", 0).value === path.value.elements[0]
# nodePath.replace
`NodePath` objects support a number of useful methods:
Replace one node with another node:
var fifth = path.get("elements", 4);
fifth.replace(newNode);
Now do some stuff that might rearrange the list, and this replacement remains safe:
fifth.replace(newerNode);
Replace the third element in an array with two new nodes:
path.get("elements", 2).replace(
b.identifier("foo"),
b.thisExpression()
);
Here is the code of the example replace.js:
import recast from "recast";
import { builders as b, visit } from "ast-types";
let ast = b.functionDeclaration(
b.identifier("fn"),
[],
b.blockStatement([
b.variableDeclaration("var", [
b.variableDeclarator(b.identifier("a"), b.literal("hello world!")),
]),
])
);
console.log(recast.print(ast).code) // function fn() { var a = "hello world!"; }
visit(ast, {
visitVariableDeclaration: function (path) {
path.replace(b.returnStatement(null));
this.traverse(path);
},
});
console.log(ast.body.body[0]); // { argument: null, loc: null, type: 'ReturnStatement', comments: null }
console.log(recast.print(ast).code) // function fn() { return; }
# nodePath.prune
Remove a node, and also its parent if removing the node would leave a redundant AST node. Example:
var t = 1, y = 2;
Simply removing the `t` and `y` declarators would leave `var undefined`, so `prune` removes the empty declaration as well.
path.prune();
returns the closest parent `NodePath`.
Here is a full example of `prune`:
import { parse } from "espree";
import { NodePath } from "ast-types";
const deb = x => (JSON.stringify(x, null, 2));
var programPath = new NodePath(parse("var y = 1,x = 2;"));
var variableDeclaration = programPath.get("body", 0);
// It has the shape { ... declarations: [ VariableDeclarator, VariableDeclarator], ... }
var yVariableDeclaratorPath = variableDeclaration.get("declarations", 0);
var xVariableDeclaratorPath = variableDeclaration.get("declarations", 1);
var remainingNodePath = yVariableDeclaratorPath.prune(); // returns the closest parent NodePath
remainingNodePath = xVariableDeclaratorPath.prune();
console.log(deb(programPath.node));
/* Output:
{
"type": "Program",
"start": 0,
"end": 16,
"body": [],
"sourceType": "script"
}
*/
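The effect of `prune` in the example above can be imitated on plain objects. The following is a sketch of the idea only, not the library's algorithm: `removeDeclarator` is a hypothetical helper that removes a declarator and, when the declaration is left empty, also removes the now redundant `VariableDeclaration` from the program body. The AST is written by hand in ESTree shape.

```javascript
// Hypothetical helper (not part of ast-types): remove the declarator at
// `declaratorIndex` from the declaration at `statementIndex`; if that leaves
// the declaration empty, remove the whole declaration as well.
function removeDeclarator(program, statementIndex, declaratorIndex) {
  const declaration = program.body[statementIndex];
  declaration.declarations.splice(declaratorIndex, 1);
  if (declaration.declarations.length === 0) {
    program.body.splice(statementIndex, 1); // avoid an empty `var` statement
  }
  return program;
}

// Hand-written ESTree-shaped AST for: var y = 1, x = 2;
const declarator = (name, value) => ({
  type: "VariableDeclarator",
  id: { type: "Identifier", name },
  init: { type: "Literal", value },
});
const program = {
  type: "Program",
  body: [{
    type: "VariableDeclaration",
    kind: "var",
    declarations: [declarator("y", 1), declarator("x", 2)],
  }],
};

removeDeclarator(program, 0, 0);
const remainingNames = program.body[0].declarations.map((d) => d.id.name);
console.log(remainingNames);      // [ 'x' ]
removeDeclarator(program, 0, 0);
console.log(program.body.length); // 0
```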
# Other NodePath methods
Remove a node from a list of nodes:
path.get("elements", 3).replace();
Add three new nodes to the beginning of a list of nodes:
path.get("elements").unshift(a, b, c);
Remove and return the first node in a list of nodes:
path.get("elements").shift();
Push two new nodes onto the end of a list of nodes:
path.get("elements").push(d, e);
Remove and return the last node in a list of nodes:
path.get("elements").pop();
Insert a new node before/after the seventh node in a list of nodes:
var seventh = path.get("elements", 6);
seventh.insertBefore(newNode);
seventh.insertAfter(newNode);
Insert a new element at index 5 in a list of nodes:
path.get("elements").insertAt(5, newNode);
# Scope
File crguezl/hello-ast-types/scope-catch.js
See the AST for the input source.
import assert from "assert";
import { parse } from "espree";
import { namedTypes as n, NodePath,} from "ast-types";
const deb = (x) => JSON.stringify(x, null, 2);
// "catch block scope"
var catchWithVarDecl = `
function foo(e) {
try {
bar();
} catch (e) {
var f = e + 1;
return function(g) {
return e + g;
};
}
return f;
}
`;
var path = new NodePath(parse(catchWithVarDecl));
var fooPath = path.get("body", 0);
var fooScope = fooPath.scope;
var catchPath = fooPath.get("body", "body", 0, "handler");
var catchScope = catchPath.scope;
// it should not affect outer scope declarations
n.FunctionDeclaration.assert(fooScope.node);
assert.strictEqual(fooScope.declares("e"), true);
assert.strictEqual(fooScope.declares("f"), true);
assert.strictEqual(fooScope.lookup("e"), fooScope);
//it should declare only the guard parameter
n.CatchClause.assert(catchScope.node);
assert.strictEqual(catchScope.declares("e"), true);
assert.strictEqual(catchScope.declares("f"), false);
assert.strictEqual(catchScope.lookup("e"), catchScope);
assert.strictEqual(catchScope.lookup("f"), fooScope);
// it should shadow only the parameter in nested scopes
// The argument of the return inside the catch
var closurePath = catchPath.get("body", "body", 1, "argument");
var closureScope = closurePath.scope;
n.FunctionExpression.assert(closureScope.node);
assert.strictEqual(closureScope.declares("e"), false);
assert.strictEqual(closureScope.declares("f"), false);
assert.strictEqual(closureScope.declares("g"), true);
assert.strictEqual(closureScope.lookup("g"), closureScope);
assert.strictEqual(closureScope.lookup("e"), catchScope);
assert.strictEqual(closureScope.lookup("f"), fooScope);
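The assertions above follow from a chain of scopes in which each scope knows its own declarations and its parent. A toy model reproduces the same answers for the `foo` example. `ToyScope` is a hypothetical class, not the ast-types `Scope` implementation, and the names declared in each scope are listed by hand instead of being computed from the AST.

```javascript
// Toy model only: the declared names of each scope are written by hand.
class ToyScope {
  constructor(label, names, parent = null) {
    this.label = label;
    this.names = new Set(names);
    this.parent = parent;
  }
  // Is `name` declared in this very scope?
  declares(name) {
    return this.names.has(name);
  }
  // Closest scope (this one or an ancestor) declaring `name`, or null
  lookup(name) {
    for (let scope = this; scope; scope = scope.parent) {
      if (scope.declares(name)) return scope;
    }
    return null;
  }
}

const globalScope = new ToyScope("global", ["foo"]);
// `var f` is hoisted to the function, so `f` belongs to foo's scope
const fooScope = new ToyScope("foo", ["e", "f"], globalScope);
const catchScope = new ToyScope("catch", ["e"], fooScope);
const closureScope = new ToyScope("closure", ["g"], catchScope);

console.log(closureScope.lookup("g").label); // closure
console.log(closureScope.lookup("e").label); // catch
console.log(closureScope.lookup("f").label); // foo
console.log(closureScope.lookup("zzz"));     // null
```

The `e` of the catch clause shadows the parameter `e` of `foo` because the lookup stops at the first scope that declares the name.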
# Warning about the use of the old JS arguments.callee
Early versions of JavaScript did not allow named function expressions, and for this reason you could not make a recursive function expression.
To write a recursive anonymous function you had to take advantage of arguments.callee. The `arguments.callee` property contains the currently executing function:
var fac = function(n) {
return !(n > 1) ? 1 : arguments.callee(n - 1) * n;
}
The 5th edition of ECMAScript (ES5) forbids its use in strict mode.
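Named function expressions make the trick unnecessary. A minimal sketch of the replacement one would aim for when updating such code follows; the inner name `factorial` is just an illustrative choice.

```javascript
"use strict";
// The function expression has its own name, visible only inside its body,
// so it can call itself without arguments.callee (also in strict mode).
var fac = function factorial(n) {
  return !(n > 1) ? 1 : factorial(n - 1) * n;
};

console.log(fac(5));           // 120
console.log(typeof factorial); // undefined: the name does not leak outside
```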
The goal of this code example: you want to detect uses of this old trick in order to update the code.
See crguezl/hello-ast-types#visitmemberexpressionjs for the usage example and crguezl/hello-ast-types/visitmemberexpression.js for a solution.
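The detection itself boils down to finding `MemberExpression` nodes whose object is the identifier `arguments` and whose property is `callee`. The sketch below does not reproduce the repository's solution and does not use ast-types: `walk`, `isArgumentsCallee` and `findArgumentsCallee` are hypothetical helpers, and the AST fragment for `arguments.callee(n - 1)` is written by hand in ESTree shape.

```javascript
// Generic walker (hypothetical helper): calls `visitor` on every object
// that has a string `type`, descending through properties and arrays.
function walk(value, visitor) {
  if (Array.isArray(value)) {
    value.forEach((item) => walk(item, visitor));
  } else if (value !== null && typeof value === "object") {
    if (typeof value.type === "string") visitor(value);
    Object.values(value).forEach((child) => walk(child, visitor));
  }
}

// Matches both arguments.callee and arguments["callee"]
function isArgumentsCallee(node) {
  if (node.type !== "MemberExpression") return false;
  if (node.object.type !== "Identifier" || node.object.name !== "arguments") return false;
  return node.computed
    ? node.property.type === "Literal" && node.property.value === "callee"
    : node.property.type === "Identifier" && node.property.name === "callee";
}

function findArgumentsCallee(ast) {
  const found = [];
  walk(ast, (node) => { if (isArgumentsCallee(node)) found.push(node); });
  return found;
}

// Hand-written ESTree-shaped AST for: arguments.callee(n - 1)
const call = {
  type: "CallExpression",
  callee: {
    type: "MemberExpression",
    computed: false,
    object: { type: "Identifier", name: "arguments" },
    property: { type: "Identifier", name: "callee" },
  },
  arguments: [{
    type: "BinaryExpression",
    operator: "-",
    left: { type: "Identifier", name: "n" },
    right: { type: "Literal", value: 1 },
  }],
};

console.log(findArgumentsCallee(call).length); // 1
```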
# Translating the ES6 spread operator ... to ES5
On one side, the spread syntax (`...`) allows an iterable such as an array expression or string to be expanded in places where
- zero or more arguments (for function calls) or
- elements (for array literals) are expected, or
- an object expression to be expanded in places where zero or more key-value pairs (for object literals) are expected.
For instance:
function sum(x, y, z) {
return x + y + z;
}
const numbers = [1, 2, 3];
console.log(sum(...numbers));
// expected output: 6
On the other side, the same `...` token used in a parameter list (the rest parameter syntax) allows for a variable number of arguments, which are received inside the function as an array:
function tutu(x, ...rest) {
return x + rest.length;
}
console.log(tutu(2,5,9))
// expected output: 4
The following transformation approaches the translation of the spread operator, so that an input like:
function tutu(x, ...rest) {
return x + rest[0];
}
is translated into:
➜ hello-ast-types git:(master) ✗ node spread-operator.js
function tutu(x) {
var rest = Array.prototype.slice.call(arguments, 1);
return x + rest[0];
}
`arguments` is an Array-like object (but is not an array!) accessible inside functions that contains the values of the arguments passed to that function.
The `array.slice(1)` method returns a shallow copy of the portion of `array` that goes from index `1` to the end, as a new array object. The original `array` will not be modified.
Since `arguments` is not an array, we can't use the `slice` method directly and have to resort to the `call` method of function objects instead.
The `call(arguments, 1)` method calls `Array.prototype.slice` with the value of `this` set to `arguments` and with `1` as its argument.
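A quick way to convince oneself of this is to run the pattern on a small function. In this sketch `restOf` is a hypothetical name; it returns every argument after the first one.

```javascript
function restOf(first) {
  // `arguments` has no slice method of its own, so we borrow Array's
  console.log(typeof arguments.slice);   // undefined
  console.log(Array.isArray(arguments)); // false
  return Array.prototype.slice.call(arguments, 1);
}

const rest = restOf(2, 5, 9);
console.log(Array.isArray(rest)); // true
console.log(rest);                // [ 5, 9 ]
```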
See the code in the file spread-operator.js in the repo crguezl/hello-ast-types.
AST compatibility
I have used `espree` to generate the initial AST. It seems to have some incompatibilities with the AST used by `ast-types`.
We load the libs needed:
import { namedTypes as n, builders as b, visit } from "ast-types";
import recast from "recast";
import * as espree from "espree";
and we have to build an auxiliary AST for the expression `Array.prototype.slice.call`, a left-nested tree whose deepest member expression is `Array.prototype`:
(For the sake of conciseness I have substituted the `memberExpression` type by a dot `.` in the figure)
We can build the auxiliary AST for the expression `Array.prototype.slice.call` with this code:
var sliceExpr = b.memberExpression(
b.memberExpression( // object
b.memberExpression( // object
b.identifier("Array"), // object
b.identifier("prototype"), // property
false
),
b.identifier("slice"), // property
false
),
b.identifier("call"), // property
false
);
Explanation of the `false` values
On a `memberExpression` node (and on some other nodes as well) there is a boolean property called `computed`. If `computed` is `true`, the node corresponds to a computed (`a[b]`) member expression and `property` is an `Expression`. If `computed` is `false`, the node corresponds to a static (`a.b`) member expression and `property` has to be an `Identifier`. In the AST of `Array.prototype.slice.call` all the `computed` properties are `false` since it is a chain of static member expressions.
See the AST for `a[b]`.
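To make the role of `computed` concrete, here is a small sketch that turns member expression nodes back into source text. `memberToString` is a hypothetical helper, not an ast-types function, and it only handles `Identifier` and `MemberExpression` nodes written by hand in ESTree shape.

```javascript
// Hypothetical helper: prints a[b] when computed is true and a.b otherwise.
function memberToString(node) {
  if (node.type === "Identifier") return node.name;
  if (node.type !== "MemberExpression") {
    throw new TypeError("Unsupported node type: " + node.type);
  }
  const object = memberToString(node.object);
  const property = memberToString(node.property);
  return node.computed ? object + "[" + property + "]" : object + "." + property;
}

const id = (name) => ({ type: "Identifier", name });
const member = (object, property, computed) =>
  ({ type: "MemberExpression", object, property, computed });

// Same nesting as sliceExpr: ((Array.prototype).slice).call
const staticChain = member(
  member(member(id("Array"), id("prototype"), false), id("slice"), false),
  id("call"),
  false
);
const computedAccess = member(id("a"), id("b"), true);

console.log(memberToString(staticChain));    // Array.prototype.slice.call
console.log(memberToString(computedAccess)); // a[b]
```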
Let us try our translator with the following input code:
let code = `
function tutu(x, ...rest) {
return x + rest[0];
}
`;
And build the AST with Espree. Although Espree tries hard to be compatible with Esprima and ast-types is based on Esprima, it seems to me that Espree and ast-types have some incompatibilities.
let ast = espree.parse(code, {ecmaVersion: 7, loc: false});
Here is the full code for the transformation:
visit(ast, {
  visitFunction(path) {
    const node = path.node;
    this.traverse(path);
    const lastArg = node.params[node.params.length - 1];
    // Nothing to do for functions without parameters or without a rest parameter
    if (!lastArg || lastArg.type !== "RestElement") return;
    // Drop the rest parameter: 'function tutu(x, ...rest)' becomes 'function tutu(x)'
    node.params.pop();
    // var rest = Array.prototype.slice.call(arguments, n);
    // where n is the number of remaining (non-rest) parameters
    const restVarDecl = b.variableDeclaration("var", [
      b.variableDeclarator(
        lastArg.argument,
        b.callExpression(sliceExpr, [
          b.identifier("arguments"),
          b.literal(node.params.length)
        ])
      )
    ]);
    // Insert the statement 'var rest = Array.prototype.slice.call(arguments, n);'
    // at the beginning of the body
    path.get("body", "body").unshift(restVarDecl);
  }
});
console.log(recast.print(ast).code);
Remember that
- The `path` argument passed to the `visitFunction` function is a `NodePath` object whose `node` property is the `Function` node being visited.
- `Function` nodes in `ast-types` stand for all kinds of functions: `FunctionDeclaration`, `FunctionExpression` and `ArrowFunctionExpression`. Therefore the `visitFunction` method is called for any node whose type is a subtype of `Function`.
- The `get` method of the `NodePath` object allows us to access and lazily create the `NodePath` of the descendants: `path.get("body", "body")` (remember that the statements of a function are in the `.body.body` of the `Function` node).
- The array `unshift` method allows us to insert the AST of `restVarDecl` at the beginning of the body.
- It's your responsibility to call `this.traverse` with some `NodePath` object (usually the one passed into the visitor method) before the visitor method returns, or return `false` to indicate that the traversal need not continue any further down this subtree. An assertion will fail if you forget to call it.
- Because you can call `this.traverse` at any point in the visitor method, it's up to the programmer whether the traversal is pre-order, post-order, or both.
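The point about traversal order can be seen with a toy recursive walker in which the recursive call plays the role of `this.traverse(path)`: work done before it happens in pre-order, work done after it happens in post-order. This is an analogy on hand-written ESTree-shaped objects, not ast-types code; `walkOrder` is a hypothetical helper.

```javascript
// Hypothetical helper: `before` runs ahead of the descent (pre-order work),
// `after` runs once the whole subtree has been processed (post-order work).
function walkOrder(node, before, after) {
  before(node);
  for (const child of Object.values(node)) {
    const children = Array.isArray(child) ? child : [child];
    for (const c of children) {
      if (c !== null && typeof c === "object" && typeof c.type === "string") {
        walkOrder(c, before, after); // plays the role of this.traverse(path)
      }
    }
  }
  after(node);
}

// Hand-written AST for: foo(bar)
const ast = {
  type: "CallExpression",
  callee: { type: "Identifier", name: "foo" },
  arguments: [{ type: "Identifier", name: "bar" }],
};

const label = (n) => (n.type === "Identifier" ? n.name : n.type);
const pre = [];
const post = [];
walkOrder(ast, (n) => pre.push(label(n)), (n) => post.push(label(n)));

console.log(pre.join(" "));  // CallExpression foo bar
console.log(post.join(" ")); // foo bar CallExpression
```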
# Checking if a function refers to this
These two rules may help to understand the semantics of `this` when used in JS functions:
- Arrow functions take their value of `this` from the lexical scope.
- Functions take their value of `this` from the context object.
The following example illustrates these rules:
let g = {
myVar: 'g',
gFunc: function() {
console.log(this.myVar); // g
let obj = {
myVar: 'foo',
a: () => console.log(this.myVar), // this arrow func is in the scope of gFunc
objFunc: function() {
console.log(this.myVar); // foo
this.a() // g
}
};
obj.objFunc()
},
}
g.gFunc();
If we are considering rewriting some function as an arrow function, a conservative policy is to make sure that the function does not refer in any way to the context object `this`.
The traversal of the AST at crguezl/hello-ast-types/check-this-usage.js attempts to detect when `this` (or `super()` or something like `super.meth()`) is used inside the body of a function.
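One possible way to implement such a check, sketched without ast-types on a hand-written AST: the walker below reports `ThisExpression` and `Super` nodes but does not descend into nested non-arrow functions, since those bind their own `this`. `usesThis` is a hypothetical helper and is not the code of check-this-usage.js.

```javascript
// Hypothetical helper: does the body of `fn` refer to its own this/super?
function usesThis(fn) {
  const visit = (value) => {
    if (Array.isArray(value)) return value.some(visit);
    if (value === null || typeof value !== "object") return false;
    if (value.type === "ThisExpression" || value.type === "Super") return true;
    // Nested ordinary functions have their own `this`: skip them.
    // Arrow functions inherit `this`, so we keep descending into them.
    if (value.type === "FunctionDeclaration" || value.type === "FunctionExpression") {
      return false;
    }
    return Object.values(value).some(visit);
  };
  return visit(fn.body);
}

// Hand-written ESTree-shaped building blocks
const id = (name) => ({ type: "Identifier", name });
const thisProp = {
  type: "MemberExpression", computed: false,
  object: { type: "ThisExpression" }, property: id("prop"),
};
const ret = (argument) => ({ type: "ReturnStatement", argument });
const fn = (name, statements) => ({
  type: "FunctionDeclaration", id: id(name), params: [],
  body: { type: "BlockStatement", body: statements },
});

// function tutu() { return this.prop; }
const direct = usesThis(fn("tutu", [ret(thisProp)]));
// function tutu() { return prop; }
const none = usesThis(fn("tutu", [ret(id("prop"))]));
// function tutu() { function titi() { return this.prop; } return prop; }
const nested = usesThis(fn("tutu", [fn("titi", [ret(thisProp)]), ret(id("prop"))]));

console.log(direct, none, nested); // true false false
```

The third case is why the nested `titi` function does not make the outer `tutu` count as using `this`.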
hello-ast-types git:(master) node check-this-usage.js
function tutu() {
return this.prop+4;
}
Inside Function visitor tutu
inside thisexpression
true
----
function tutu() {
return prop+4;
}
Inside Function visitor tutu
false
----
function tutu() {
function titi() {
return this.prop+4;
}
return prop+4;
}
Inside Function visitor tutu
Inside Function visitor titi
false
----
function tutu() {
return super();
}
Inside Function visitor tutu
true
----
function tutu() {
return super.meth();
}
Inside Function visitor tutu
true
----
47