isophonics-drupal-site: vendor/nikic/php-parser/doc/2_Usage_of_basic

annotate vendor/nikic/php-parser/doc/2_Usage_of_basic_components.markdown @ 2:92f882872392

Trusted hosts, + remove migration modules

author	Chris Cannam
date	Tue, 05 Dec 2017 09:26:43 +0000
parents	4c8ae668cc8c
children	5fb285c0d0e3

Usage of basic components
=========================

This document explains how to use the parser, the pretty printer and the node traverser.

Bootstrapping
-------------

To bootstrap the library, include the autoloader generated by Composer:

```php
require 'path/to/vendor/autoload.php';
```

Additionally, you may want to set the `xdebug.max_nesting_level` ini option to a higher value:

```php
ini_set('xdebug.max_nesting_level', 3000);
```

This ensures that there will be no errors when traversing highly nested node trees. However, it is
preferable to disable Xdebug completely, as it can easily make this library more than five times
slower.

Parsing
-------

In order to parse code, you first have to create a parser instance:

```php
use PhpParser\ParserFactory;
$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
```

The factory accepts a kind argument that determines how different PHP versions are treated:

Kind | Behavior
-----|---------
`ParserFactory::PREFER_PHP7` | Try to parse code as PHP 7. If this fails, try to parse it as PHP 5.
`ParserFactory::PREFER_PHP5` | Try to parse code as PHP 5. If this fails, try to parse it as PHP 7.
`ParserFactory::ONLY_PHP7` | Parse code as PHP 7.
`ParserFactory::ONLY_PHP5` | Parse code as PHP 5.

Unless you have a strong reason to use something else, `PREFER_PHP7` is a reasonable default.

The `create()` method optionally accepts a `Lexer` instance as the second argument. Some use cases
that require customized lexers are discussed in the [lexer documentation](component/Lexer.markdown).

Subsequently, you can pass PHP code (including the opening `<?php` tag) to the `parse()` method in order to
create a syntax tree. If a syntax error is encountered, a `PhpParser\Error` exception will be thrown:

```php
use PhpParser\Error;
use PhpParser\ParserFactory;

$code = '<?php // some code';
$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);

try {
    $stmts = $parser->parse($code);
    // $stmts is an array of statement nodes
} catch (Error $e) {
    echo 'Parse Error: ', $e->getMessage();
}
```

A parser instance can be reused to parse multiple files.
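
As a rough sketch of that reuse, the loop below feeds several snippets through one parser and keeps going after a syntax error. The `$sources` array and its file names are made up for illustration; real code would typically read each file with `file_get_contents()`.

```php
require 'path/to/vendor/autoload.php';

use PhpParser\Error;
use PhpParser\ParserFactory;

$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);

// Hypothetical inputs standing in for real files.
$sources = [
    'a.php'      => '<?php echo 1;',
    'b.php'      => '<?php function f() { return 2; }',
    'broken.php' => '<?php echo ;',
];

$statementCounts = [];
foreach ($sources as $name => $code) {
    try {
        // The same $parser instance handles every input.
        $statementCounts[$name] = count($parser->parse($code));
    } catch (Error $e) {
        // A syntax error in one input does not prevent parsing the next one.
        echo $name, ': ', $e->getMessage(), "\n";
    }
}

print_r($statementCounts);
```
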

Node tree
---------

If you use the above code with `$code = "<?php echo 'Hi ', hi\\getTarget();"` the parser will
generate a node tree looking like this:

```
array(
    0: Stmt_Echo(
        exprs: array(
            0: Scalar_String(
                value: Hi
            )
            1: Expr_FuncCall(
                name: Name(
                    parts: array(
                        0: hi
                        1: getTarget
                    )
                )
                args: array(
                )
            )
        )
    )
)
```

Thus `$stmts` will contain an array with only one node, with this node being an instance of
`PhpParser\Node\Stmt\Echo_`.

As PHP is a large language, there are approximately 140 different nodes. To make working
with them easier, they are grouped into three categories, plus a few nodes that fit none of them:

 * `PhpParser\Node\Stmt`s are statement nodes, i.e. language constructs that do not return
   a value and cannot occur in an expression. For example, a class definition is a statement.
   It doesn't return a value and you can't write something like `func(class A {});`.
 * `PhpParser\Node\Expr`s are expression nodes, i.e. language constructs that return a value
   and thus can occur in other expressions. Examples of expressions are `$var`
   (`PhpParser\Node\Expr\Variable`) and `func()` (`PhpParser\Node\Expr\FuncCall`).
 * `PhpParser\Node\Scalar`s are nodes representing scalar values, like `'string'`
   (`PhpParser\Node\Scalar\String_`), `0` (`PhpParser\Node\Scalar\LNumber`) or magic constants
   like `__FILE__` (`PhpParser\Node\Scalar\MagicConst\File`). All `PhpParser\Node\Scalar`s extend
   `PhpParser\Node\Expr`, as scalars are expressions, too.
 * Some nodes belong to none of these groups, for example names (`PhpParser\Node\Name`)
   and call arguments (`PhpParser\Node\Arg`).

Some node class names have a trailing `_`. This is used whenever the class name would otherwise clash
with a PHP keyword.

Every node has zero or more subnodes. You can access subnodes by writing
`$node->subNodeName`. The `Stmt\Echo_` node has only one subnode, `exprs`. So in order to access it
in the above example you would write `$stmts[0]->exprs`. If you wanted to access the name of the function
call, you would write `$stmts[0]->exprs[1]->name`.

All nodes also define a `getType()` method that returns the node type. The type is the class name
without the `PhpParser\Node\` prefix and with `\` replaced by `_`. It also does not contain a trailing
`_` for reserved-keyword class names.
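
The subnode access and type naming described above can be tried out on the same `echo` example. This is only a small sketch; the variable names `$echo` and `$call` are chosen here for readability and are not part of the library.

```php
require 'path/to/vendor/autoload.php';

use PhpParser\ParserFactory;

$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$stmts  = $parser->parse("<?php echo 'Hi ', hi\\getTarget();");

$echo = $stmts[0];        // PhpParser\Node\Stmt\Echo_
$call = $echo->exprs[1];  // PhpParser\Node\Expr\FuncCall

echo $echo->getType(), "\n";         // Stmt_Echo (no trailing underscore)
echo $call->getType(), "\n";         // Expr_FuncCall
echo $echo->exprs[0]->value, "\n";   // the string value 'Hi '
echo $call->name->toString(), "\n";  // hi\getTarget
```
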

It is possible to associate custom metadata with a node using the `setAttribute()` method. This data
can then be retrieved using `hasAttribute()`, `getAttribute()` and `getAttributes()`.

By default the lexer adds the `startLine`, `endLine` and `comments` attributes. `comments` is an array
of `PhpParser\Comment[\Doc]` instances.

The start line can also be accessed using `getLine()`/`setLine()` (instead of `getAttribute('startLine')`).
The last doc comment from the `comments` attribute can be obtained using `getDocComment()`.
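
A short sketch of the attribute methods, assuming the default lexer configuration. The `reviewed` attribute is an arbitrary name invented for this example.

```php
require 'path/to/vendor/autoload.php';

use PhpParser\ParserFactory;

$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$stmts  = $parser->parse("<?php\n\n/** Greets the user. */\necho 'Hi';");
$echo   = $stmts[0];

// Attributes added by the lexer.
echo $echo->getLine(), "\n";                    // start line of the echo statement
echo $echo->getAttribute('endLine'), "\n";
echo count($echo->getAttribute('comments')), "\n";

// The last doc comment, or null if there is none.
$doc = $echo->getDocComment();
echo $doc === null ? 'no doc comment' : $doc->getText(), "\n";

// Custom metadata ('reviewed' is a made-up key).
$echo->setAttribute('reviewed', true);
var_dump($echo->hasAttribute('reviewed'));
var_dump($echo->getAttribute('missing', 'fallback')); // second argument is the default
```
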

Pretty printer
--------------

The pretty printer component compiles the AST back to PHP code. As the parser does not retain formatting
information, the formatting is done using a specified scheme. Currently there is only one scheme available,
namely `PhpParser\PrettyPrinter\Standard`.

```php
use PhpParser\Error;
use PhpParser\ParserFactory;
use PhpParser\PrettyPrinter;

$code = "<?php echo 'Hi ', hi\\getTarget();";

$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$prettyPrinter = new PrettyPrinter\Standard;

try {
    // parse
    $stmts = $parser->parse($code);

    // change
    $stmts[0]         // the echo statement
          ->exprs     // sub expressions
          [0]         // the first of them (the string node)
          ->value     // its value, i.e. 'Hi '
          = 'Hello '; // change to 'Hello '

    // pretty print
    $code = $prettyPrinter->prettyPrint($stmts);

    echo $code;
} catch (Error $e) {
    echo 'Parse Error: ', $e->getMessage();
}
```

The above code will output:

    echo 'Hello ', hi\getTarget();

As you can see, the source code was first parsed using `PhpParser\Parser->parse()`, then changed and then
converted back to code using `PhpParser\PrettyPrinter\Standard->prettyPrint()`.

The `prettyPrint()` method pretty prints a statements array and does not emit an opening `<?php` tag.
It is also possible to pretty print only a single expression using `prettyPrintExpr()`.

The `prettyPrintFile()` method can be used to print an entire file. This will include the opening `<?php` tag
and handle inline HTML as the first/last statement more gracefully.
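
To make the difference between the three printing methods more concrete, the following sketch prints the same parsed code in each way. It reuses the `echo` example from above and makes no claims about the exact whitespace that `prettyPrintFile()` produces.

```php
require 'path/to/vendor/autoload.php';

use PhpParser\ParserFactory;
use PhpParser\PrettyPrinter;

$parser        = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$prettyPrinter = new PrettyPrinter\Standard;

$stmts = $parser->parse("<?php echo 'Hi ', hi\\getTarget();");

// A single expression: only the function call, without a trailing semicolon.
$exprCode = $prettyPrinter->prettyPrintExpr($stmts[0]->exprs[1]);
echo $exprCode, "\n"; // hi\getTarget()

// A statement list: no opening tag.
$stmtCode = $prettyPrinter->prettyPrint($stmts);
echo $stmtCode, "\n"; // echo 'Hi ', hi\getTarget();

// A whole file: starts with the opening <?php tag.
$fileCode = $prettyPrinter->prettyPrintFile($stmts);
echo $fileCode, "\n";
```
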

Node traversal
--------------

The above pretty printing example used the fact that the source code was known and thus it was easy to
write code that accesses a certain part of a node tree and changes it. Normally this is not the case.
Usually you want to change or analyze code in a generic way, where you don't know what the node tree is
going to look like.

For this purpose the parser provides a component for traversing and visiting the node tree. The basic
structure of a program using this `PhpParser\NodeTraverser` looks like this:

```php
use PhpParser\NodeTraverser;
use PhpParser\ParserFactory;
use PhpParser\PrettyPrinter;

$parser        = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$traverser     = new NodeTraverser;
$prettyPrinter = new PrettyPrinter\Standard;

// add your visitor
$traverser->addVisitor(new MyNodeVisitor);

try {
    $code = file_get_contents($fileName);

    // parse
    $stmts = $parser->parse($code);

    // traverse
    $stmts = $traverser->traverse($stmts);

    // pretty print
    $code = $prettyPrinter->prettyPrintFile($stmts);

    echo $code;
} catch (PhpParser\Error $e) {
    echo 'Parse Error: ', $e->getMessage();
}
```

The corresponding node visitor might look like this:

```php
use PhpParser\Node;
use PhpParser\NodeVisitorAbstract;

class MyNodeVisitor extends NodeVisitorAbstract
{
    public function leaveNode(Node $node) {
        if ($node instanceof Node\Scalar\String_) {
            $node->value = 'foo';
        }
    }
}
```

The above node visitor would change all string literals in the program to `'foo'`.

All visitors must implement the `PhpParser\NodeVisitor` interface, which defines the following four
methods:

```php
public function beforeTraverse(array $nodes);
public function enterNode(\PhpParser\Node $node);
public function leaveNode(\PhpParser\Node $node);
public function afterTraverse(array $nodes);
```

The `beforeTraverse()` method is called once before the traversal begins and is passed the nodes the
traverser was called with. This method can be used for resetting values before the traversal or
preparing the tree for it.

The `afterTraverse()` method is similar to the `beforeTraverse()` method, with the only difference that
it is called once after the traversal.

The `enterNode()` and `leaveNode()` methods are called on every node, the former when it is entered,
i.e. before its subnodes are traversed, the latter when it is left.

All four methods can either return the changed node or not return at all (i.e. `null`), in which
case the current node is not changed.

The `enterNode()` method can additionally return the value `NodeTraverser::DONT_TRAVERSE_CHILDREN`,
which instructs the traverser to skip all children of the current node.

The `leaveNode()` method can additionally return the value `NodeTraverser::REMOVE_NODE`, in which
case the current node will be removed from the parent array. Furthermore, it is possible to return
an array of nodes, which will be merged into the parent array at the offset of the current node.
I.e. if in `array(A, B, C)` the node `B` should be replaced with `array(X, Y, Z)` the result will
be `array(A, X, Y, Z, C)`.
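
The effect of these two return values on the parent array can be pictured with plain PHP arrays. This is only an illustration using strings in place of nodes and `array_splice()`; it is not how the traverser is implemented internally.

```php
// Strings stand in for nodes in this illustration.
$parent = ['A', 'B', 'C'];

// Returning an array from leaveNode() for B: the array takes B's place.
$merged = $parent;
array_splice($merged, array_search('B', $merged, true), 1, ['X', 'Y', 'Z']);
print_r($merged);  // A, X, Y, Z, C

// Returning NodeTraverser::REMOVE_NODE for B: B disappears and the rest closes up.
$removed = $parent;
array_splice($removed, array_search('B', $removed, true), 1);
print_r($removed); // A, C
```
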

Instead of manually implementing the `NodeVisitor` interface you can also extend the `NodeVisitorAbstract`
class, which will define empty default implementations for all the above methods.

The NameResolver node visitor
-----------------------------

One visitor is already bundled with the package: `PhpParser\NodeVisitor\NameResolver`. This visitor
helps you work with namespaced code by trying to resolve most names to fully qualified ones.

For example, consider the following code:

    use A as B;
    new B\C();

In order to know that `B\C` really is `A\C` you would need to track aliases and namespaces yourself.
The `NameResolver` takes care of that and resolves names as far as possible.

After running it, most names will be fully qualified. The only names that will stay unqualified are
unqualified function and constant names. These are resolved at runtime and thus the visitor can't
know which function they are referring to. In most cases this is a non-issue, because the global functions
are meant.

The `NameResolver` also adds a `namespacedName` subnode to class, function and constant declarations.
It contains the namespaced name instead of only the short name that is available via `name`.
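
A minimal sketch of the resolver in action on the snippet above. The `FullyQualifiedNameCollector` class is a hypothetical helper written for this example; it simply records every fully qualified name node it encounters after the `NameResolver` has run. Under the behavior described above, the collected list should contain `A\C`.

```php
require 'path/to/vendor/autoload.php';

use PhpParser\Node;
use PhpParser\NodeTraverser;
use PhpParser\NodeVisitor\NameResolver;
use PhpParser\NodeVisitorAbstract;
use PhpParser\ParserFactory;

// Hypothetical helper visitor, not part of the library.
class FullyQualifiedNameCollector extends NodeVisitorAbstract
{
    public $names = [];

    public function leaveNode(Node $node) {
        if ($node instanceof Node\Name\FullyQualified) {
            $this->names[] = $node->toString();
        }
    }
}

$parser    = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$traverser = new NodeTraverser;
$collector = new FullyQualifiedNameCollector;

// Visitors run in the order they were added, so the resolver goes first.
$traverser->addVisitor(new NameResolver);
$traverser->addVisitor($collector);

$traverser->traverse($parser->parse('<?php use A as B; new B\C();'));

print_r($collector->names);
```
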

Example: Converting namespaced code to pseudo namespaces
--------------------------------------------------------

A small example to understand the concept: We want to convert namespaced code to pseudo namespaces
so it works on PHP 5.2, i.e. names like `A\B` should be converted to `A_B`. Note that such conversions
are fairly complicated if you take PHP's dynamic features into account, so our conversion will
assume that no dynamic features are used.

We start off with the following base code:

```php
use PhpParser\ParserFactory;
use PhpParser\PrettyPrinter;
use PhpParser\NodeTraverser;
use PhpParser\NodeVisitor\NameResolver;

$inDir  = '/some/path';
$outDir = '/some/other/path';

$parser        = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$traverser     = new NodeTraverser;
$prettyPrinter = new PrettyPrinter\Standard;

$traverser->addVisitor(new NameResolver); // we will need resolved names
$traverser->addVisitor(new NamespaceConverter); // our own node visitor

// iterate over all .php files in the directory
$files = new \RecursiveIteratorIterator(new \RecursiveDirectoryIterator($inDir));
$files = new \RegexIterator($files, '/\.php$/');

foreach ($files as $file) {
    try {
        // read the file that should be converted
        $code = file_get_contents($file);

        // parse
        $stmts = $parser->parse($code);

        // traverse
        $stmts = $traverser->traverse($stmts);

        // pretty print
        $code = $prettyPrinter->prettyPrintFile($stmts);

        // write the converted file to the target directory
        file_put_contents(
            substr_replace($file->getPathname(), $outDir, 0, strlen($inDir)),
            $code
        );
    } catch (PhpParser\Error $e) {
        echo 'Parse Error: ', $e->getMessage();
    }
}
```
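
The target path computation in the loop above can be checked in isolation. The sketch below wraps it in a hypothetical helper, `mapToOutputPath()`, and uses a made-up file path. Note that `file_put_contents()` does not create missing directories, so a real script would probably have to create the target directory first.

```php
// Hypothetical helper, not part of PHP-Parser: maps a path below $inDir to the
// same relative location below $outDir by swapping the leading directory.
function mapToOutputPath($path, $inDir, $outDir)
{
    return substr_replace($path, $outDir, 0, strlen($inDir));
}

$target = mapToOutputPath('/some/path/lib/Foo.php', '/some/path', '/some/other/path');
echo $target, "\n"; // /some/other/path/lib/Foo.php

// One possible way to prepare the directory before writing (left commented out
// so that this sketch has no side effects):
// if (!is_dir(dirname($target))) {
//     mkdir(dirname($target), 0777, true);
// }
```
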

Now let's start with the main code, the `NamespaceConverter` visitor. One thing it needs to do
is convert `A\B` style names to `A_B` style ones.

```php
use PhpParser\Node;

class NamespaceConverter extends \PhpParser\NodeVisitorAbstract
{
    public function leaveNode(Node $node) {
        if ($node instanceof Node\Name) {
            return new Node\Name($node->toString('_'));
        }
    }
}
```

The above code takes advantage of the fact that the `NameResolver` already resolved all names as far as
possible, so we don't need to do that. We only need to create a string with the name parts separated
by underscores instead of backslashes. This is what `$node->toString('_')` does. (If you want to
create a name with backslashes either write `$node->toString()` or `(string) $node`.) Then we create
a new name from the string and return it. Returning a new node replaces the old node.

Another thing we need to do is change the class/function/const declarations. Currently they contain
only the short name (i.e. the last part of the name), but they need to contain the complete name including
the namespace prefix:

```php
use PhpParser\Node;
use PhpParser\Node\Stmt;

class NamespaceConverter extends \PhpParser\NodeVisitorAbstract
{
    public function leaveNode(Node $node) {
        if ($node instanceof Node\Name) {
            return new Node\Name($node->toString('_'));
        } elseif ($node instanceof Stmt\Class_
                  || $node instanceof Stmt\Interface_
                  || $node instanceof Stmt\Function_) {
            $node->name = $node->namespacedName->toString('_');
        } elseif ($node instanceof Stmt\Const_) {
            foreach ($node->consts as $const) {
                $const->name = $const->namespacedName->toString('_');
            }
        }
    }
}
```

There is not much more to it than converting the namespaced name to a string with `_` as the separator.

The last thing we need to do is remove the `namespace` and `use` statements:

```php
use PhpParser\Node;
use PhpParser\Node\Stmt;
use PhpParser\NodeTraverser;

class NamespaceConverter extends \PhpParser\NodeVisitorAbstract
{
    public function leaveNode(Node $node) {
        if ($node instanceof Node\Name) {
            return new Node\Name($node->toString('_'));
        } elseif ($node instanceof Stmt\Class_
                  || $node instanceof Stmt\Interface_
                  || $node instanceof Stmt\Function_) {
            $node->name = $node->namespacedName->toString('_');
        } elseif ($node instanceof Stmt\Const_) {
            foreach ($node->consts as $const) {
                $const->name = $const->namespacedName->toString('_');
            }
        } elseif ($node instanceof Stmt\Namespace_) {
            // returning an array merges it into the parent array
            return $node->stmts;
        } elseif ($node instanceof Stmt\Use_) {
            // returning REMOVE_NODE removes the node altogether
            return NodeTraverser::REMOVE_NODE;
        }
    }
}
```

That's all.
