# Phases of a Translator
The material for this lesson is in the repo ULL-ESIT-GRADOII-PL/esprima-pegjs-jsconfeu-talk. Clone this repo.
The examples in this repo use a couple of JavaScript compiler frameworks: Esprima and Espree.
Espree started out as a fork of Esprima v1.2.2, the last stable published release of Esprima before work on ECMAScript 6 began. Espree is now built on top of Acorn, which has a modular architecture that allows extension of core functionality. The goal of Espree is to produce output that is similar to Esprima's, with a similar API, so that it can be used in place of Esprima.
# Introduction to Espree. REPL example
Once the repo ULL-ESIT-GRADOII-PL/esprima-pegjs-jsconfeu-talk has been cloned, we install the dependencies:
➜ esprima-pegjs-jsconfeu-talk git:(master) npm i
and start the Node.js REPL:
➜ esprima-pegjs-jsconfeu-talk git:(master) node
Welcome to Node.js v14.4.0.
Type ".help" for more information.
# Espree supportedEcmaVersions
We load espree:
> const espree = require('espree')
undefined
> espree.version
'7.3.1'
> espree.latestEcmaVersion
12
> espree.supportedEcmaVersions
[
  3,  5,  6,  7, 8,
  9, 10, 11, 12
]
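These numbers are ECMAScript edition numbers. From edition 6 onwards each edition is also known by its year of publication (edition 6 is ES2015), so the year is the edition number plus 2009. A minimal sketch of that mapping, with the list copied from the output above; the helper name `editionName` is made up for illustration and is not part of Espree:

```javascript
// List copied from the espree.supportedEcmaVersions output shown above.
const supported = [3, 5, 6, 7, 8, 9, 10, 11, 12];

// Hypothetical helper: editions 3 and 5 keep their number,
// editions >= 6 are named after their publication year.
function editionName(version) {
  return version >= 6 ? `ES${version + 2009}` : `ES${version}`;
}

const names = supported.map(editionName);
console.log(names.join(' '));
```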
# Lexical analysis
Let's perform a lexical analysis:
> espree.tokenize('answer = /* comment*/ 42', { range: true })
[
  Token {
    type: 'Identifier',
    value: 'answer',
    start: 0,
    end: 6,
    range: [ 0, 6 ]
  },
  Token {
    type: 'Punctuator',
    value: '=',
    start: 7,
    end: 8,
    range: [ 7, 8 ]
  },
  Token {
    type: 'Numeric',
    value: '42',
    start: 22,
    end: 24,
    range: [ 22, 24 ]
  }
]
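Each `range` is a half-open interval `[start, end)` of character offsets into the source string. The sketch below does not call Espree; it uses a trimmed, hand-copied version of the tokens above to show that slicing the source with each range recovers the token text, and that the comment falls in the gap between two tokens:

```javascript
const source = 'answer = /* comment*/ 42';

// Hand-copied from the tokenize output above (only the fields used here).
const tokens = [
  { type: 'Identifier', value: 'answer', range: [0, 6] },
  { type: 'Punctuator', value: '=', range: [7, 8] },
  { type: 'Numeric', value: '42', range: [22, 24] },
];

// Slicing the source with each range gives back the token's text.
const lexemes = tokens.map(({ range: [start, end] }) => source.slice(start, end));
console.log(lexemes);

// Text skipped by the lexer between consecutive tokens (whitespace and comments).
const gaps = tokens
  .slice(1)
  .map((tok, i) => source.slice(tokens[i].range[1], tok.range[0]));
console.log(gaps);
```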
# Syntax analysis with Espree
Now let's perform a syntax analysis:
> espree.parse('const answer = 42', { tokens: true })
Uncaught [SyntaxError: The keyword 'const' is reserved
] {
  index: 0,
  lineNumber: 1,
  column: 1
}
The ECMAScript version that espree uses by default is 5, which does not support const.
Let's specify the ECMAScript version we want:
> espree.parse('const answer = 42',
    { ecmaVersion: espree.latestEcmaVersion,
      tokens: true }
  )
Node {
  type: 'Program',
  start: 0,
  end: 17,
  body: [
    Node {
      type: 'VariableDeclaration',
      start: 0,
      end: 17,
      declarations: [Array],
      kind: 'const'
    }
  ],
  sourceType: 'script',
  tokens: [
    Token { type: 'Keyword', value: 'const', start: 0, end: 5 },
    Token { type: 'Identifier', value: 'answer', start: 6, end: 12 },
    Token { type: 'Punctuator', value: '=', start: 13, end: 14 },
    Token { type: 'Numeric', value: '42', start: 15, end: 17 }
  ]
}
# util.inspect
Note that the tree is not shown in full. The logger used by the Node REPL truncates it at the declarations child (it only shows that it is an array, [Array], without expanding it) so that the output does not become excessively long.
To display the whole tree we will use the util.inspect method of the util module, which converts an object into a string:
> const util = require('util')
undefined
> console.log(
    util.inspect(
      espree.parse('const answer = 42',{ecmaVersion: 6}),
      {depth: null}
    )
  )
Node {
  type: 'Program',
  start: 0,
  end: 17,
  body: [
    Node {
      type: 'VariableDeclaration',
      start: 0,
      end: 17,
      declarations: [
        Node {
          type: 'VariableDeclarator',
          start: 6,
          end: 17,
          id: Node {
            type: 'Identifier',
            start: 6,
            end: 12,
            name: 'answer'
          },
          init: Node {
            type: 'Literal',
            start: 15,
            end: 17,
            value: 42,
            raw: '42'
          }
        }
      ],
      kind: 'const'
    }
  ],
  sourceType: 'script'
}
undefined
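The truncation comes from the default `depth` option of `util.inspect`, which is 2: anything nested more deeply is abbreviated. The sketch below reproduces the effect without Espree, using a plain object that is a trimmed, hand-written copy of the tree above (positions omitted):

```javascript
const util = require('util');

// Trimmed, hand-written copy of the AST printed above (no start/end fields).
const ast = {
  type: 'Program',
  body: [
    {
      type: 'VariableDeclaration',
      declarations: [
        {
          type: 'VariableDeclarator',
          id: { type: 'Identifier', name: 'answer' },
          init: { type: 'Literal', value: 42, raw: '42' },
        },
      ],
      kind: 'const',
    },
  ],
  sourceType: 'script',
};

const shallow = util.inspect(ast);               // default depth: 2
const full = util.inspect(ast, { depth: null }); // no depth limit

console.log(shallow); // the declarations array is abbreviated as [Array]
console.log(full);    // the whole tree, down to the Identifier and the Literal
```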
# The AST object generated by the Espree parser
You can see that the object is made up of objects of the class Node. If you focus only on the type fields, it becomes more evident how the object describes the AST hierarchy built for the phrase answer = 42. In the edge labels I have put the names of the attributes and their types ([Node] denotes an array of Node objects).
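One way to see that hierarchy in text form is to print only the `type` fields as an indented outline, labelling each edge with the attribute it comes from. This is a minimal sketch rather than anything from the repo: the function names `isNode` and `outline` are made up, the tree is a trimmed hand-written copy of the `const answer = 42` AST shown earlier, and any object with a string `type` field is treated as a node.

```javascript
// Trimmed, hand-written copy of the AST for `const answer = 42`.
const ast = {
  type: 'Program',
  body: [
    {
      type: 'VariableDeclaration',
      declarations: [
        {
          type: 'VariableDeclarator',
          id: { type: 'Identifier', name: 'answer' },
          init: { type: 'Literal', value: 42, raw: '42' },
        },
      ],
      kind: 'const',
    },
  ],
  sourceType: 'script',
};

// Assumption: anything that is an object with a string `type` is an AST node.
const isNode = (v) => v !== null && typeof v === 'object' && typeof v.type === 'string';

// Collect one line per node: indentation = depth, label = edge name.
function outline(node, label = '', depth = 0, lines = []) {
  lines.push('  '.repeat(depth) + label + node.type);
  for (const [key, value] of Object.entries(node)) {
    if (Array.isArray(value)) {
      value.forEach((child, i) => {
        if (isNode(child)) outline(child, `${key}[${i}]: `, depth + 1, lines);
      });
    } else if (isNode(value)) {
      outline(value, `${key}: `, depth + 1, lines);
    }
  }
  return lines;
}

const lines = outline(ast);
console.log(lines.join('\n'));
```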
# Node types and names of the children
Navigating the AST is complicated.
The espree.VisitorKeys attribute gives us the list of node types and the names of the attributes that hold their children:
> const typesOfNodes = Object.keys(espree.VisitorKeys)
undefined
> typesOfNodes.slice(0,4)
[
  'AssignmentExpression',
  'AssignmentPattern',
  'ArrayExpression',
  'ArrayPattern'
]
The value associated with each key gives us the names of the attributes that define the children:
> espree.VisitorKeys.AssignmentExpression
[ 'left', 'right' ]
> espree.VisitorKeys.IfStatement
[ 'test', 'consequent', 'alternate' ]
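A table like this is enough to write a generic traversal: visit a node, then recurse into the attributes listed for its type. The sketch below is a simplified stand-in and not how any particular library is implemented. The `visitorKeys` table is a small hand-written subset (only the entries this example needs), the AST for `answer = 42` is written by hand, and the `traverse` function is hypothetical.

```javascript
// Hand-written subset of a visitor-keys table (assumption: only what this example needs).
const visitorKeys = {
  Program: ['body'],
  ExpressionStatement: ['expression'],
  AssignmentExpression: ['left', 'right'],
  Identifier: [],
  Literal: [],
};

// Hand-written ESTree-style AST for the statement `answer = 42`.
const ast = {
  type: 'Program',
  body: [
    {
      type: 'ExpressionStatement',
      expression: {
        type: 'AssignmentExpression',
        operator: '=',
        left: { type: 'Identifier', name: 'answer' },
        right: { type: 'Literal', value: 42, raw: '42' },
      },
    },
  ],
  sourceType: 'script',
};

// Depth-first, pre-order traversal driven by the keys table.
function traverse(node, enter, parent = null) {
  enter(node, parent);
  for (const key of visitorKeys[node.type] || []) {
    const value = node[key];
    const children = Array.isArray(value) ? value : [value];
    for (const child of children) {
      if (child) traverse(child, enter, node); // skip null children
    }
  }
}

const visited = [];
traverse(ast, (node, parent) => {
  visited.push(`${node.type} (parent: ${parent ? parent.type : 'none'})`);
});
console.log(visited.join('\n'));
```

Driving the recursion from the table, instead of inspecting every property, keeps the traversal away from non-child fields such as `operator` or `raw`.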
# The ASTExplorer.net website
Using the web tool https://astexplorer.net we can navigate the AST produced by several JS compilers.
# Traversing the AST
# Traversing with estraverse
The file idgrep.js is a very simple example of using Espree (loaded here under the variable name esprima) to do static analysis on JavaScript code.
It provides a function idgrep that finds the appearances of identifiers matching a search string inside the input code.
#!/usr/bin/env node
const fs = require("fs");
const esprima = require("espree");
const program = require("commander");
const { version, description } = require("./package.json");
const estraverse = require("estraverse");
const idgrep = function (pattern, code, filename) {
  const lines = code.split("\n");
  if (/^#!/.test(lines[0])) code = code.replace(/^.*/, ""); // Avoid line "#!/usr/bin/env node"
  const ast = esprima.parse(code, {
    ecmaVersion: 6,
    loc: true,
    range: true,
  });
  estraverse.traverse(ast, {
    enter: function (node, parent) {
      if (node.type === "Identifier" && pattern.test(node.name)) {
        let loc = node.loc.start;
        let line = loc.line - 1;
        console.log(
          `file ${filename}: line ${loc.line}: col: ${loc.column} text: ${lines[line]}`
        );
      }
    },
  });
};
program
  .version(version)
  .description(description)
  .option("-p --pattern [regexp]", "regexp to use in the search", "hack")
  .usage("[options] <filename>");
program.parse(process.argv);
const options = program.opts();
const pattern = new RegExp(options.pattern);
if (program.args.length == 0) program.help();
for (const inputFilename of program.args) {
  try {
    fs.readFile(inputFilename, "utf8", (err, input) => {
      debugger;
      if (err) throw `Error reading '${inputFilename}':${err}`;
      idgrep(pattern, input, inputFilename);
    });
  } catch (e) {
    console.log(`Errores! ${e}`);
  }
}
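A detail worth noting in idgrep is how the `#!` line is handled: `code.replace(/^.*/, "")` empties the first line but leaves its newline in place, so the line numbers reported by the parser still match the indices of the `lines` array. A small standalone sketch of that behaviour, using a made-up two-line input:

```javascript
// Made-up input: a shebang line followed by one statement.
const code = '#!/usr/bin/env node\nconst hack = 1;\n';
const lines = code.split('\n');

// Same test and replacement as in idgrep: without the `m` flag, /^.*/ matches
// only the first line, and `.` stops before the newline.
const stripped = /^#!/.test(lines[0]) ? code.replace(/^.*/, '') : code;
const strippedLines = stripped.split('\n');

console.log(JSON.stringify(stripped));
console.log(strippedLines.length === lines.length); // line count is preserved
```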
Examples of executions.
With two input files:
➜ (private) ✗ ./idgrep.js espree-logging-solution.js hello-ast-espree.js -p ast
file espree-logging-solution.js: line 13: col: 10 text: estraverse.traverse(ast, {
file espree-logging-solution.js: line 14: col: 24 text: enter: function(node) {
file espree-logging-solution.js: line 23: col: 30 text: }
file hello-ast-espree.js: line 3: col: 6 text: function getAnswer() {
file hello-ast-espree.js: line 8: col: 25 text: undefined
With a single file, testing hacky.js (observe how the appearances of hack inside the comment or the string aren't shown):
➜ esprima-pegjs-jsconfeu-talk git:(private) ✗ ./idgrep.js -p hac hacky.js
file hacky.js: line 2: col: 6 text: /* This hack does not count */
file hacky.js: line 4: col: 8 text: let another = 9;
When the file doesn't exist:
➜ esprima-pegjs-jsconfeu-talk git:(private) ✗ ./idgrep.js fhjdfjhdsj
/Users/casianorodriguezleon/campus-virtual/shared/esprima-pegjs-jsconfeu-talk-labs/esprima-pegjs-jsconfeu-talk/idgrep.js:45
if (err) throw `Error reading '${inputFilename}':${err}`;
^
Error reading 'fhjdfjhdsj':Error: ENOENT: no such file or directory, open 'fhjdfjhdsj'
(Use `node --trace-uncaught ...` to show where the exception was thrown)
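The exception is reported as uncaught because the callback passed to `fs.readFile` runs later, after the surrounding `try` block has already finished, so its `catch` clause never sees the `throw`. One possible alternative, shown only as a sketch and not as the repo's code, is to read the file synchronously so that the error is raised inside the `try`; the function name `readSource` and the shape of its return value are made up for illustration.

```javascript
const fs = require('fs');

// Hypothetical helper: returns either the file text or a readable error message.
function readSource(filename) {
  try {
    // readFileSync throws right here, inside the try block, if the file is missing.
    return { ok: true, text: fs.readFileSync(filename, 'utf8') };
  } catch (err) {
    return { ok: false, message: `Error reading '${filename}': ${err.message}` };
  }
}

// Same nonexistent file name as in the run above.
const result = readSource('fhjdfjhdsj');
console.log(result.ok ? result.text : result.message);
```

With the callback API, the equivalent fix would be to handle `err` inside the callback itself instead of throwing.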
# How to build a Parser
# First Steps on Building a Parser with Jison
See the examples in the repo crguezl/hello-jison.
This repo contains two examples:
- The first one is a simple interpreter for infix arithmetic expressions with the minus operator only
  - See files minus.jison, minus.l and use_minus.js
- The second is a translator from infix arithmetic expressions to JavaScript
  - minus-ast.jison builds an Espree-compatible AST using minus.l and the helpers in ast-build.js
  - The lexical analyzer minus.l is reused
  - The ast-*.json files contain examples of Espree ASTs
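To give an idea of what those two examples compute, here is a hand-written sketch in plain JavaScript. It does not use Jison and is not the code in the repo: the functions `lex`, `parse`, `evaluate` and `generate` are made up for illustration. It handles numbers and a left-associative minus, builds ESTree-style `BinaryExpression` and `Literal` nodes, and then either interprets the tree or translates it to a JavaScript expression.

```javascript
// Lexer: non-negative integers and '-' only; whitespace is skipped.
function lex(input) {
  const text = input.trim();
  const re = /\s*(?:(\d+)|(-))/y; // sticky: each match must start at lastIndex
  const tokens = [];
  while (re.lastIndex < text.length) {
    const pos = re.lastIndex;
    const m = re.exec(text);
    if (m === null) throw new SyntaxError(`Unexpected input at position ${pos}`);
    tokens.push(m[1] !== undefined ? { type: 'NUMBER', raw: m[1] } : { type: '-' });
  }
  return tokens;
}

// Parser for: e -> NUMBER ('-' NUMBER)*, building a left-leaning tree.
function parse(input) {
  const tokens = lex(input);
  let i = 0;
  const number = () => {
    const tok = tokens[i++];
    if (!tok || tok.type !== 'NUMBER') throw new SyntaxError('Expected a number');
    return { type: 'Literal', value: Number(tok.raw), raw: tok.raw };
  };
  let left = number();
  while (i < tokens.length) {
    if (tokens[i].type !== '-') throw new SyntaxError("Expected '-'");
    i++;
    left = { type: 'BinaryExpression', operator: '-', left, right: number() };
  }
  return left;
}

// Interpreter: evaluate the tree directly.
function evaluate(node) {
  return node.type === 'Literal'
    ? node.value
    : evaluate(node.left) - evaluate(node.right);
}

// Translator: emit a fully parenthesized JavaScript expression.
function generate(node) {
  return node.type === 'Literal'
    ? node.raw
    : `(${generate(node.left)} - ${generate(node.right)})`;
}

const tree = parse('4 - 2 - 1');
console.log(evaluate(tree)); // left-associative: (4 - 2) - 1
console.log(generate(tree));
```

Building the tree with the left operand nested is what makes `4 - 2 - 1` evaluate as `(4 - 2) - 1` rather than `4 - (2 - 1)`.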
# Calculator example with PEG.js from the talk Parsing, Compiling, and Static Metaprogramming
altjs.js is the code for the "AltJS language in 5 minutes" section presented in the second half of the talk Parsing, Compiling, and Static Metaprogramming by Patrick Dubroy.
# References
- Repo ULL-ESIT-GRADOII-PL/esprima-pegjs-jsconfeu-talk
- Simple examples of AST traversal and transformation: crguezl/ast-traversal
- crguezl/hello-jison
- Espree
- astexplorer.net demo
- idgrep.js
- Master the Art of the AST
- Awesome AST
- ESQuery is a library for querying the AST output by Esprima for patterns of syntax using a CSS style selector system.
  - esquery repo at GitHub
  - Check out the Demo