# Phases of a Translator
You will find the material for this lesson in the repo ULL-ESIT-GRADOII-PL/esprima-pegjs-jsconfeu-talk. Clone this repo.
The examples in this repo use a couple of JavaScript compiler frameworks: Esprima and Espree.
Espree started out as a fork of Esprima v1.2.2, the last stable published release of Esprima before work on ECMAScript 6 began. Espree is now built on top of Acorn, which has a modular architecture that allows extension of core functionality. The goal of Espree is to produce output that is similar to Esprima with a similar API so that it can be used in place of Esprima.
# Introduction to Espree. REPL example
Once the repo ULL-ESIT-GRADOII-PL/esprima-pegjs-jsconfeu-talk has been cloned, we install the dependencies:
➜ esprima-pegjs-jsconfeu-talk git:(master) npm i
and start the Node.js REPL:
➜ esprima-pegjs-jsconfeu-talk git:(master) node
Welcome to Node.js v14.4.0.
Type ".help" for more information.
# Espree supportedEcmaVersions
We load espree:
> const espree = require('espree')
undefined
> espree.version
'7.3.1'
> espree.latestEcmaVersion
12
> espree.supportedEcmaVersions
[
3, 5, 6, 7, 8,
9, 10, 11, 12
]
# Lexical analysis
Let's run a lexical analysis:
> espree.tokenize('answer = /* comment*/ 42', { range: true })
[
  Token {
    type: 'Identifier',
    value: 'answer',
    start: 0,
    end: 6,
    range: [ 0, 6 ]
  },
  Token {
    type: 'Punctuator',
    value: '=',
    start: 7,
    end: 8,
    range: [ 7, 8 ]
  },
  Token {
    type: 'Numeric',
    value: '42',
    start: 22,
    end: 24,
    range: [ 22, 24 ]
  }
]
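To make the idea of lexical analysis concrete, here is a minimal sketch of a hand-written tokenizer. It is not how Espree is implemented: the token categories, the regular expressions and the names `TOKEN_SPEC` and `tokenize` are assumptions chosen for illustration, and only a tiny subset of JavaScript is covered. It shows how whitespace and comments are skipped while each token records its type, value and position.

```javascript
// Toy lexer: NOT Espree's implementation, only an illustration of the idea.
// The token categories and their regexps are simplifying assumptions.
const TOKEN_SPEC = [
  { type: null, regex: /\s+|\/\*[\s\S]*?\*\//y }, // whitespace and block comments: skipped
  { type: 'Numeric', regex: /\d+/y },
  { type: 'Identifier', regex: /[A-Za-z_$][\w$]*/y },
  { type: 'Punctuator', regex: /[=+\-*\/();]/y },
];

function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    for (const { type, regex } of TOKEN_SPEC) {
      regex.lastIndex = pos; // sticky flag: the match must start exactly at pos
      const m = regex.exec(input);
      if (!m) continue;
      if (type) {
        tokens.push({ type, value: m[0], start: pos, end: pos + m[0].length });
      }
      pos += m[0].length;
      matched = true;
      break;
    }
    if (!matched) {
      throw new SyntaxError(`Unexpected character '${input[pos]}' at ${pos}`);
    }
  }
  return tokens;
}

console.log(tokenize('answer = /* comment*/ 42'));
```

For the input used above, the sketch yields three tokens with the same type, value, start and end as in the Espree output, because the comment is consumed without producing a token.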
# Syntax analysis with Espree
Now let's run a syntax analysis:
> espree.parse('const answer = 42', { tokens: true })
Uncaught [SyntaxError: The keyword 'const' is reserved
] {
index: 0,
lineNumber: 1,
column: 1
}
By default, espree parses ECMAScript 5, which does not support const.
Let's specify the ECMAScript version we want:
> espree.parse('const answer = 42',
    { ecmaVersion: espree.latestEcmaVersion,
      tokens: true }
  )
Node {
  type: 'Program',
  start: 0,
  end: 17,
  body: [
    Node {
      type: 'VariableDeclaration',
      start: 0,
      end: 17,
      declarations: [Array],
      kind: 'const'
    }
  ],
  sourceType: 'script',
  tokens: [
    Token { type: 'Keyword', value: 'const', start: 0, end: 5 },
    Token { type: 'Identifier', value: 'answer', start: 6, end: 12 },
    Token { type: 'Punctuator', value: '=', start: 13, end: 14 },
    Token { type: 'Numeric', value: '42', start: 15, end: 17 }
  ]
}
# util.inspect
Observe that the tree is not shown in full. The logger used by the Node REPL truncates it at the declarations child (it only shows that it is an array, [Array], without expanding it) so that the output is not excessively long.
To see the whole tree we will use the util.inspect method of the util module, which converts an object into a string:
> const util = require('util')
undefined
> console.log(
    util.inspect(
      espree.parse('const answer = 42',{ecmaVersion: 6}),
      {depth: null}
    )
  )
Node {
  type: 'Program',
  start: 0,
  end: 17,
  body: [
    Node {
      type: 'VariableDeclaration',
      start: 0,
      end: 17,
      declarations: [
        Node {
          type: 'VariableDeclarator',
          start: 6,
          end: 17,
          id: Node {
            type: 'Identifier',
            start: 6,
            end: 12,
            name: 'answer'
          },
          init: Node {
            type: 'Literal',
            start: 15,
            end: 17,
            value: 42,
            raw: '42'
          }
        }
      ],
      kind: 'const'
    }
  ],
  sourceType: 'script'
}
undefined
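The effect of the depth option can be reproduced without espree. The following is a minimal sketch that uses a plain-object stand-in for the tree (real Espree nodes are Node instances and also carry start and end); the variable names are illustrative assumptions. It compares the default inspection depth with depth: null.

```javascript
const util = require('util');

// Plain-object stand-in for the tree returned by the parser (an assumption
// made so the sketch runs without installing espree).
const ast = {
  type: 'Program',
  body: [{
    type: 'VariableDeclaration',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'answer' },
      init: { type: 'Literal', value: 42, raw: '42' },
    }],
    kind: 'const',
  }],
  sourceType: 'script',
};

// Default depth is 2: anything nested deeper is abbreviated.
const truncated = util.inspect(ast);
// depth: null removes the limit and expands every level.
const complete = util.inspect(ast, { depth: null });

console.log(truncated);
console.log(complete);
```

With the default depth the declarations array is abbreviated to [Array], as in the REPL session above, while depth: null expands it down to the Identifier and Literal nodes. console.dir(ast, { depth: null }) accepts the same option and prints directly.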
# The AST object generated by the Espree parser
Notice that the object is made of objects of the class Node. If you focus only on the type fields, it becomes clearer how the object describes the AST hierarchy built for the phrase const answer = 42. The edge labels of the tree diagram show the attribute names and their types ([Node] denotes an array of Node objects).
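One way to focus on the type fields is to print an outline of the tree. This is a minimal sketch under stated assumptions: the tree is a hand-copied plain object (without start and end), and the helpers `isNode` and `outline` are hypothetical names that detect children by duck typing, that is, any object with a string type counts as a node.

```javascript
// Plain-object copy of the tree for: const answer = 42
const ast = {
  type: 'Program',
  body: [{
    type: 'VariableDeclaration',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'answer' },
      init: { type: 'Literal', value: 42, raw: '42' },
    }],
    kind: 'const',
  }],
  sourceType: 'script',
};

// Duck typing (an assumption): a node is any object with a string `type`.
const isNode = (v) =>
  v !== null && typeof v === 'object' && typeof v.type === 'string';

// Returns one line per node: the attribute that holds it, then its type.
// Attributes holding arrays of nodes are marked with [].
function outline(node, label = '', depth = 0, lines = []) {
  lines.push('  '.repeat(depth) + label + node.type);
  for (const [key, value] of Object.entries(node)) {
    if (Array.isArray(value)) {
      value
        .filter(isNode)
        .forEach((child) => outline(child, `${key}[]: `, depth + 1, lines));
    } else if (isNode(value)) {
      outline(value, `${key}: `, depth + 1, lines);
    }
  }
  return lines;
}

console.log(outline(ast).join('\n'));
```

Duck typing is fragile: if the tree had been produced with tokens: true, the Token objects would also be reported as nodes, since they have a type field too. A table of child attribute names per node type avoids that ambiguity.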
# Node types and names of the children
Navigating the AST is complicated.
The espree.VisitorKeys attribute provides the list of node types and, for each one, the names of the attributes that hold its children:
> const typesOfNodes = Object.keys(espree.VisitorKeys)
undefined
> typesOfNodes.slice(0,4)
[
'AssignmentExpression',
'AssignmentPattern',
'ArrayExpression',
'ArrayPattern'
]
The value gives the names of the attributes that define the children:
> espree.VisitorKeys.AssignmentExpression
[ 'left', 'right' ]
> espree.VisitorKeys.IfStatement
[ 'test', 'consequent', 'alternate' ]
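A table like this is enough to drive a generic traversal. The sketch below assumes a small hand-copied subset with the same shape as espree.VisitorKeys (named `KEYS` here so that the example runs without installing espree) and a hypothetical `walk` helper. It visits the nodes in pre-order and collects the identifier names.

```javascript
// Hand-copied subset shaped like espree.VisitorKeys (an assumption; with
// espree installed, the real table could be passed in instead).
const KEYS = {
  Program: ['body'],
  VariableDeclaration: ['declarations'],
  VariableDeclarator: ['id', 'init'],
  ExpressionStatement: ['expression'],
  AssignmentExpression: ['left', 'right'],
  Identifier: [],
  Literal: [],
};

// Pre-order walk: call enter(node, parent), then descend only through the
// attributes listed for the node's type.
function walk(node, enter, parent = null) {
  enter(node, parent);
  for (const key of KEYS[node.type] || []) {
    const child = node[key];
    if (Array.isArray(child)) {
      child.forEach((c) => c && walk(c, enter, node));
    } else if (child) {
      walk(child, enter, node);
    }
  }
}

// Hand-written tree for the statement: answer = other
const ast = {
  type: 'Program',
  body: [{
    type: 'ExpressionStatement',
    expression: {
      type: 'AssignmentExpression',
      operator: '=',
      left: { type: 'Identifier', name: 'answer' },
      right: { type: 'Identifier', name: 'other' },
    },
  }],
};

const names = [];
walk(ast, (node) => {
  if (node.type === 'Identifier') names.push(node.name);
});
console.log(names);
```

Because the walk only follows the listed attributes, fields such as operator, kind or a tokens array are never mistaken for children.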
# The ASTExplorer.net web site
Using the web tool https://astexplorer.net we can browse the AST produced by several JS compilers.
# Traversing the AST
# Traversing with estraverse
The file idgrep.js is a very simple example of static analysis of JavaScript code. It parses the input with Espree (imported under the name esprima) and traverses the resulting AST with estraverse.
It provides a function idgrep that finds the appearances of identifiers matching a search string inside the input code.
#!/usr/bin/env node
const fs = require("fs");
const esprima = require("espree");
const program = require("commander");
const { version, description } = require("./package.json");
const estraverse = require("estraverse");

const idgrep = function (pattern, code, filename) {
  const lines = code.split("\n");
  if (/^#!/.test(lines[0])) code = code.replace(/^.*/, ""); // Avoid line "#!/usr/bin/env node"
  const ast = esprima.parse(code, {
    ecmaVersion: 6,
    loc: true,
    range: true,
  });
  estraverse.traverse(ast, {
    enter: function (node, parent) {
      if (node.type === "Identifier" && pattern.test(node.name)) {
        let loc = node.loc.start;
        let line = loc.line - 1;
        console.log(
          `file ${filename}: line ${loc.line}: col: ${loc.column} text: ${lines[line]}`
        );
      }
    },
  });
};

program
  .version(version)
  .description(description)
  .option("-p --pattern [regexp]", "regexp to use in the search", "hack")
  .usage("[options] <filename>");

program.parse(process.argv);
const options = program.opts();
const pattern = new RegExp(options.pattern);
if (program.args.length == 0) program.help();

for (const inputFilename of program.args) {
  try {
    fs.readFile(inputFilename, "utf8", (err, input) => {
      debugger;
      if (err) throw `Error reading '${inputFilename}':${err}`;
      idgrep(pattern, input, inputFilename);
    });
  } catch (e) {
    console.log(`Errores! ${e}`);
  }
}
Examples of executions.
With two input files:
➜ (private) ✗ ./idgrep.js espree-logging-solution.js hello-ast-espree.js -p ast
file espree-logging-solution.js: line 13: col: 10 text: estraverse.traverse(ast, {
file espree-logging-solution.js: line 14: col: 24 text: enter: function(node) {
file espree-logging-solution.js: line 23: col: 30 text: }
file hello-ast-espree.js: line 3: col: 6 text: function getAnswer() {
file hello-ast-espree.js: line 8: col: 25 text: undefined
With a single file, testing hacky.js (observe that the appearances of hack inside the comment or the string are not shown):
➜ esprima-pegjs-jsconfeu-talk git:(private) ✗ ./idgrep.js -p hac hacky.js
file hacky.js: line 2: col: 6 text: /* This hack does not count */
file hacky.js: line 4: col: 8 text: let another = 9;
When the file doesn't exist:
➜ esprima-pegjs-jsconfeu-talk git:(private) ✗ ./idgrep.js fhjdfjhdsj
/Users/casianorodriguezleon/campus-virtual/shared/esprima-pegjs-jsconfeu-talk-labs/esprima-pegjs-jsconfeu-talk/idgrep.js:45
if (err) throw `Error reading '${inputFilename}':${err}`;
^
Error reading 'fhjdfjhdsj':Error: ENOENT: no such file or directory, open 'fhjdfjhdsj'
(Use `node --trace-uncaught ...` to show where the exception was thrown)
# How to build a Parser
# First Steps on Building a Parser with Jison
See the examples in the repo crguezl/hello-jison.
This repo contains two examples:
- The first one is a simple interpreter for infix arithmetic expressions with the minus operator only
  - See files minus.jison, minus.l and use_minus.js
- The second is a translator from infix arithmetic expressions to JavaScript
  - minus-ast.jison builds an Espree compatible AST using minus.l and the helpers in ast-build.js
  - The lexical analyzer minus.l is reused
  - The ast-*.json files contain examples of Espree ASTs
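To give an idea of what such a parser does, here is a hand-written sketch in plain JavaScript. It is not the Jison grammar from the repo: the grammar in the comment, the function names `parseMinus` and `evaluate`, and the omission of position information are all assumptions made for illustration. It builds BinaryExpression and Literal nodes shaped like the ones Espree produces and then interprets the tree.

```javascript
// Hand-written sketch: NOT the Jison grammar from the repo.
// Grammar assumed here:  expr -> NUMBER ('-' NUMBER)*
function parseMinus(input) {
  const tokens = input.match(/\d+|-|\S/g) || []; // numbers, '-', or any stray char
  let pos = 0;

  function number() {
    const tok = tokens[pos++];
    if (tok === undefined || !/^\d+$/.test(tok)) {
      throw new SyntaxError(`Expected a number but found '${tok}'`);
    }
    return { type: 'Literal', value: Number(tok), raw: tok };
  }

  let left = number();
  while (tokens[pos] === '-') {
    pos++;
    // Folding to the left makes '-' left-associative: (4 - 2) - 1
    left = { type: 'BinaryExpression', operator: '-', left, right: number() };
  }
  if (pos < tokens.length) {
    throw new SyntaxError(`Unexpected token '${tokens[pos]}'`);
  }
  return left;
}

// Interpreter: evaluates the tree bottom-up.
function evaluate(node) {
  return node.type === 'Literal'
    ? node.value
    : evaluate(node.left) - evaluate(node.right);
}

const ast = parseMinus('4 - 2 - 1');
console.log(JSON.stringify(ast, null, 2));
console.log(evaluate(ast));
```

The loop is what encodes associativity: 4 - 2 - 1 is grouped as (4 - 2) - 1 and evaluates to 1, whereas a right-associative grouping would give 3. A complete Espree-style tree would additionally wrap the expression in ExpressionStatement and Program nodes.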
# Calculator example with PEG.js from the talk Parsing, Compiling, and Static Metaprogramming
altjs.js is the code for the "AltJS language in 5 minutes" section presented in the second half of the talk Parsing, Compiling, and Static Metaprogramming by Patrick Dubroy.
# References
- Repo ULL-ESIT-GRADOII-PL/esprima-pegjs-jsconfeu-talk
- Simple examples of AST traversal and transformation: crguezl/ast-traversal
- crguezl/hello-jison
- Espree
- astexplorer.net demo
- idgrep.js
- Master the Art of the AST
- Awesome AST
- ESQuery is a library for querying the AST output by Esprima for patterns of syntax using a CSS style selector system.
  - esquery repo at GitHub
  - Check out the Demo