cmmr2012-drupal-site: vendor/nikic/php-parser/doc/component/Walking_the

annotate vendor/nikic/php-parser/doc/component/Walking_the_AST.markdown @ 5:12f9dff5fda9 tip

Update to Drupal core 8.7.1

author	Chris Cannam
date	Thu, 09 May 2019 15:34:47 +0100
parents	a9cd425dd02b
children

rev	line source
Chris@0	1 Walking the AST
Chris@0	2 ===============
Chris@0	3
Chris@0	4 The most common way to work with the AST is by using a node traverser and one or more node visitors.
Chris@0	5 As a basic example, the following code changes all literal integers in the AST into strings (e.g.,
Chris@0	6 `42` becomes `'42'`.)
Chris@0	7
Chris@0	8 ```php
Chris@0	9 use PhpParser\{Node, NodeTraverser, NodeVisitorAbstract};
Chris@0	10
Chris@0	11 $traverser = new NodeTraverser;
Chris@0	12 $traverser->addVisitor(new class extends NodeVisitorAbstract {
Chris@0	13     public function leaveNode(Node $node) {
Chris@0	14         if ($node instanceof Node\Scalar\LNumber) {
Chris@0	15             return new Node\Scalar\String_((string) $node->value);
Chris@0	16         }
Chris@0	17     }
Chris@0	18 });
Chris@0	19
Chris@0	20 $stmts = ...;
Chris@0	21 $modifiedStmts = $traverser->traverse($stmts);
Chris@0	22 ```
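
To run the snippet above end to end, the statements have to come from a parser and the result has to be printed back out. The following is a minimal sketch, assuming the PHP-Parser 4.x API and a Composer autoloader at `vendor/autoload.php`; the input string and the `$code` and `$result` variables are illustrative only.

```php
<?php
// Assumption: nikic/php-parser 4.x installed via Composer next to this script.
require __DIR__ . '/vendor/autoload.php';

use PhpParser\{Node, NodeTraverser, NodeVisitorAbstract, ParserFactory};
use PhpParser\PrettyPrinter\Standard;

// Illustrative input; any PHP source string can be used here.
$code = '<?php echo 42 + 1;';

// Parse the source into an array of statement nodes.
$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$stmts = $parser->parse($code);

// Same visitor as above: replace integer literals with string literals.
$traverser = new NodeTraverser;
$traverser->addVisitor(new class extends NodeVisitorAbstract {
    public function leaveNode(Node $node) {
        if ($node instanceof Node\Scalar\LNumber) {
            return new Node\Scalar\String_((string) $node->value);
        }
    }
});
$modifiedStmts = $traverser->traverse($stmts);

// Convert the modified AST back into PHP source code.
$result = (new Standard)->prettyPrint($modifiedStmts);
echo $result, "\n";
```

The printed code should contain the quoted literals `'42'` and `'1'` in place of the original integers.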
Chris@0	23
Chris@0	24 Node visitors
Chris@0	25 -------------
Chris@0	26
Chris@0	27 Each node visitor implements an interface with the following four methods:
Chris@0	28
Chris@0	29 ```php
Chris@0	30 interface NodeVisitor {
Chris@0	31     public function beforeTraverse(array $nodes);
Chris@0	32     public function enterNode(Node $node);
Chris@0	33     public function leaveNode(Node $node);
Chris@0	34     public function afterTraverse(array $nodes);
Chris@0	35 }
Chris@0	36 ```
Chris@0	37
Chris@0	38 The `beforeTraverse()` and `afterTraverse()` methods are called before and after the traversal
Chris@0	39 respectively, and are passed the entire AST. They can be used to perform any necessary state
Chris@0	40 setup or cleanup.
Chris@0	41
Chris@0	42 The `enterNode()` method is called when a node is first encountered, before its children are
Chris@0	43 processed ("preorder"). The `leaveNode()` method is called after all children have been visited
Chris@0	44 ("postorder").
Chris@0	45
Chris@0	46 For example, if we have the following excerpt of an AST
Chris@0	47
Chris@0	48 ```
Chris@0	49 Expr_FuncCall(
Chris@0	50     name: Name(
Chris@0	51         parts: array(
Chris@0	52             0: printLine
Chris@0	53         )
Chris@0	54     )
Chris@0	55     args: array(
Chris@0	56         0: Arg(
Chris@0	57             value: Scalar_String(
Chris@0	58                 value: Hello World!!!
Chris@0	59             )
Chris@0	60             byRef: false
Chris@0	61             unpack: false
Chris@0	62         )
Chris@0	63     )
Chris@0	64 )
Chris@0	65 ```
Chris@0	66
Chris@0	67 then the enter/leave methods will be called in the following order:
Chris@0	68
Chris@0	69 ```
Chris@0	70 enterNode(Expr_FuncCall)
Chris@0	71 enterNode(Name)
Chris@0	72 leaveNode(Name)
Chris@0	73 enterNode(Arg)
Chris@0	74 enterNode(Scalar_String)
Chris@0	75 leaveNode(Scalar_String)
Chris@0	76 leaveNode(Arg)
Chris@0	77 leaveNode(Expr_FuncCall)
Chris@0	78 ```
Chris@0	79
Chris@0	80 A common pattern is that `enterNode` is used to collect some information and then `leaveNode`
Chris@0	81 performs modifications based on that. At the time when `leaveNode` is called, all the code inside
Chris@0	82 the node will have already been visited and necessary information collected.
Chris@0	83
Chris@0	84 As you usually do not want to implement all four methods, it is recommended that you extend
Chris@0	85 `NodeVisitorAbstract` instead of implementing the interface directly. The abstract class provides
Chris@0	86 empty default implementations.
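
The call order described in this section can be observed directly with a small logging visitor. This is a sketch assuming the PHP-Parser 4.x API and a Composer autoloader; the class name `CallOrderLogger` and its `$log` property are hypothetical names chosen for this illustration.

```php
<?php
// Assumption: nikic/php-parser 4.x installed via Composer next to this script.
require __DIR__ . '/vendor/autoload.php';

use PhpParser\{Node, NodeTraverser, NodeVisitorAbstract, ParserFactory};

// Hypothetical helper: records every enterNode/leaveNode call by node type.
class CallOrderLogger extends NodeVisitorAbstract {
    public $log = [];

    public function beforeTraverse(array $nodes) {
        // Reset state so the same visitor instance can be reused.
        $this->log = [];
    }

    public function enterNode(Node $node) {
        $this->log[] = 'enterNode(' . $node->getType() . ')';
    }

    public function leaveNode(Node $node) {
        $this->log[] = 'leaveNode(' . $node->getType() . ')';
    }
}

$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$stmts = $parser->parse("<?php printLine('Hello World!!!');");

$logger = new CallOrderLogger;
$traverser = new NodeTraverser;
$traverser->addVisitor($logger);
$traverser->traverse($stmts);

echo implode("\n", $logger->log), "\n";
```

Because the parsed input is a complete statement rather than an AST excerpt, the log should additionally contain an enclosing `Stmt_Expression` pair around the calls listed earlier.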
Chris@0	87
Chris@0	88 Modifying the AST
Chris@0	89 -----------------
Chris@0	90
Chris@0	91 There are a number of ways in which the AST can be modified from inside a node visitor. The first
Chris@0	92 and simplest is to change AST properties directly inside the visitor:
Chris@0	93
Chris@0	94 ```php
Chris@0	95 public function leaveNode(Node $node) {
Chris@0	96     if ($node instanceof Node\Scalar\LNumber) {
Chris@0	97         // increment all integer literals
Chris@0	98         $node->value++;
Chris@0	99     }
Chris@0	100 }
Chris@0	101 ```
Chris@0	102
Chris@0	103 The second is to replace a node entirely by returning a new node:
Chris@0	104
Chris@0	105 ```php
Chris@0	106 public function leaveNode(Node $node) {
Chris@0	107     if ($node instanceof Node\Expr\BinaryOp\BooleanAnd) {
Chris@0	108         // Convert all $a && $b expressions into !($a && $b)
Chris@0	109         return new Node\Expr\BooleanNot($node);
Chris@0	110     }
Chris@0	111 }
Chris@0	112 ```
Chris@0	113
Chris@0	114 Doing this is supported both inside enterNode and leaveNode. However, you have to be mindful about
Chris@0	115 where you perform the replacement: If a node is replaced in enterNode, then the recursive traversal
Chris@0	116 will also consider the children of the new node. If you aren't careful, this can lead to infinite
Chris@0	117 recursion. For example, let's take the previous code sample and use enterNode instead:
Chris@0	118
Chris@0	119 ```php
Chris@0	120 public function enterNode(Node $node) {
Chris@0	121     if ($node instanceof Node\Expr\BinaryOp\BooleanAnd) {
Chris@0	122         // Convert all $a && $b expressions into !($a && $b)
Chris@0	123         return new Node\Expr\BooleanNot($node);
Chris@0	124     }
Chris@0	125 }
Chris@0	126 ```
Chris@0	127
Chris@0	128 Now `$a && $b` will be replaced by `!($a && $b)`. Then the traverser will go into the first (and
Chris@0	129 only) child of `!($a && $b)`, which is `$a && $b`. The transformation applies again and we end up
Chris@0	130 with `!!($a && $b)`. This will continue until PHP hits the memory limit.
Chris@0	131
Chris@0	132 Finally, two special replacement types are supported only by leaveNode. The first is removal of a
Chris@0	133 node:
Chris@0	134
Chris@0	135 ```php
Chris@0	136 public function leaveNode(Node $node) {
Chris@0	137     if ($node instanceof Node\Stmt\Return_) {
Chris@0	138         // Remove all return statements
Chris@0	139         return NodeTraverser::REMOVE_NODE;
Chris@0	140     }
Chris@0	141 }
Chris@0	142 ```
Chris@0	143
Chris@0	144 Node removal only works if the parent structure is an array. This means that usually it only makes
Chris@0	145 sense to remove nodes of type `Node\Stmt`, as they always occur inside statement lists (and a few
Chris@0	146 more node types like `Arg` or `Expr\ArrayItem`, which are also always part of lists).
Chris@0	147
Chris@0	148 On the other hand, removing a `Node\Expr` does not make sense: If you have `$a * $b`, there is no
Chris@0	149 meaningful way in which the `$a` part could be removed. If you want to remove an expression, you
Chris@0	150 generally want to remove it together with a surrounding expression statement:
Chris@0	151
Chris@0	152 ```php
Chris@0	153 public function leaveNode(Node $node) {
Chris@0	154     if ($node instanceof Node\Stmt\Expression
Chris@0	155         && $node->expr instanceof Node\Expr\FuncCall
Chris@0	156         && $node->expr->name instanceof Node\Name
Chris@0	157         && $node->expr->name->toString() === 'var_dump'
Chris@0	158     ) {
Chris@0	159         return NodeTraverser::REMOVE_NODE;
Chris@0	160     }
Chris@0	161 }
Chris@0	162 ```
Chris@0	163
Chris@0	164 This example will remove all calls to `var_dump()` which occur as expression statements. This means
Chris@0	165 that `var_dump($a);` will be removed, but `if (var_dump($a))` will not be removed (and there is no
Chris@0	166 obvious way in which it can be removed).
Chris@0	167
Chris@0	168 Besides removing nodes, it is also possible to replace one node with multiple nodes. Again, this
Chris@0	169 only works inside leaveNode and only if the parent structure is an array.
Chris@0	170
Chris@0	171 ```php
Chris@0	172 public function leaveNode(Node $node) {
Chris@0	173     if ($node instanceof Node\Stmt\Return_ && $node->expr !== null) {
Chris@0	174         // Convert "return foo();" into "$retval = foo(); return $retval;"
Chris@0	175         $var = new Node\Expr\Variable('retval');
Chris@0	176         return [
Chris@0	177             new Node\Stmt\Expression(new Node\Expr\Assign($var, $node->expr)),
Chris@0	178             new Node\Stmt\Return_($var),
Chris@0	179         ];
Chris@0	180     }
Chris@0	181 }
Chris@0	182 ```
Chris@0	183
Chris@0	184 Short-circuiting traversal
Chris@0	185 --------------------------
Chris@0	186
Chris@0	187 An AST can easily contain thousands of nodes, and traversing over all of them may be slow,
Chris@0	188 especially if you have more than one visitor. In some cases, it is possible to avoid a full
Chris@0	189 traversal.
Chris@0	190
Chris@0	191 If you are looking for all class declarations in a file (and assuming you're not interested in
Chris@0	192 anonymous classes), you know that once you've seen a class declaration, there is no point in also
Chris@0	193 checking all its child nodes, because PHP does not allow nesting classes. In this case, you can
Chris@0	194 instruct the traverser to not recurse into the class node:
Chris@0	195
Chris@0	196 ```php
Chris@0	197 private $classes = [];
Chris@0	198 public function enterNode(Node $node) {
Chris@0	199     if ($node instanceof Node\Stmt\Class_) {
Chris@0	200         $this->classes[] = $node;
Chris@0	201         return NodeTraverser::DONT_TRAVERSE_CHILDREN;
Chris@0	202     }
Chris@0	203 }
Chris@0	204 ```
Chris@0	205
Chris@0	206 Of course, this option is only available in enterNode, because it's already too late by the time
Chris@0	207 leaveNode is reached.
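
The fragment above only shows the visitor's members. A complete, runnable version might look as follows; this is a sketch assuming the PHP-Parser 4.x API and a Composer autoloader. The class name `ClassCollector`, the `$entered` counter, and the public visibility of the properties are choices made for this illustration, not part of the library.

```php
<?php
// Assumption: nikic/php-parser 4.x installed via Composer next to this script.
require __DIR__ . '/vendor/autoload.php';

use PhpParser\{Node, NodeTraverser, NodeVisitorAbstract, ParserFactory};

// Hypothetical visitor: collects class declarations without descending into them.
class ClassCollector extends NodeVisitorAbstract {
    public $classes = [];
    public $entered = 0; // counts enterNode() calls to show the saved work

    public function enterNode(Node $node) {
        $this->entered++;
        if ($node instanceof Node\Stmt\Class_) {
            $this->classes[] = $node;
            // Skip methods, properties and everything else inside the class.
            return NodeTraverser::DONT_TRAVERSE_CHILDREN;
        }
    }
}

$code = '<?php class A { public function f() { return 1; } } class B {}';
$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$stmts = $parser->parse($code);

$collector = new ClassCollector;
$traverser = new NodeTraverser;
$traverser->addVisitor($collector);
$traverser->traverse($stmts);

foreach ($collector->classes as $class) {
    echo $class->name->toString(), "\n";
}
echo 'Nodes entered: ', $collector->entered, "\n";
```

Since neither class body is traversed, the counter stays at the number of top-level class nodes rather than growing with the size of the methods.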
Chris@0	208
Chris@0	209 If you are only looking for one specific node, it is also possible to abort the traversal entirely
Chris@0	210 after finding it. For example, if you are looking for the node of a class with a certain name (and
Chris@0	211 discounting exotic cases like conditionally defining a class two times), you can stop traversal
Chris@0	212 once you have found it:
Chris@0	213
Chris@0	214 ```php
Chris@0	215 private $class = null;
Chris@0	216 public function enterNode(Node $node) {
Chris@0	217     if ($node instanceof Node\Stmt\Class_ &&
Chris@4	218         $node->namespacedName->toString() === 'Foo\Bar\Baz'
Chris@0	219     ) {
Chris@0	220         $this->class = $node;
Chris@0	221         return NodeTraverser::STOP_TRAVERSAL;
Chris@0	222     }
Chris@0	223 }
Chris@0	224 ```
Chris@0	225
Chris@0	226 This works both in enterNode and leaveNode. Note that this particular case can also be more easily
Chris@0	227 handled using a NodeFinder, which will be introduced below.
Chris@0	228
Chris@0	229 Multiple visitors
Chris@0	230 -----------------
Chris@0	231
Chris@0	232 A single traverser can be used with multiple visitors:
Chris@0	233
Chris@0	234 ```php
Chris@0	235 $traverser = new NodeTraverser;
Chris@0	236 $traverser->addVisitor($visitorA);
Chris@0	237 $traverser->addVisitor($visitorB);
Chris@4	238 $stmts = $traverser->traverse($stmts);
Chris@0	239 ```
Chris@0	240
Chris@0	241 It is important to understand that if a traverser is run with multiple visitors, the visitors will
Chris@0	242 be interleaved. Given the following AST excerpt
Chris@0	243
Chris@0	244 ```
Chris@0	245 Stmt_Return(
Chris@0	246     expr: Expr_Variable(
Chris@0	247         name: foobar
Chris@0	248     )
Chris@0	249 )
Chris@0	250 ```
Chris@0	251
Chris@0	252 the following method calls will be performed:
Chris@0	253
Chris@0	254 ```
Chris@0	255 $visitorA->enterNode(Stmt_Return)
Chris@0	256 $visitorB->enterNode(Stmt_Return)
Chris@0	257 $visitorA->enterNode(Expr_Variable)
Chris@0	258 $visitorB->enterNode(Expr_Variable)
Chris@0	259 $visitorA->leaveNode(Expr_Variable)
Chris@0	260 $visitorB->leaveNode(Expr_Variable)
Chris@0	261 $visitorA->leaveNode(Stmt_Return)
Chris@0	262 $visitorB->leaveNode(Stmt_Return)
Chris@0	263 ```
Chris@0	264
Chris@0	265 That is, when visiting a node, enterNode and leaveNode will always be called for all visitors.
Chris@0	266 Running multiple visitors in parallel improves performance, as the AST only has to be traversed
Chris@0	267 once. However, it is not always possible to write visitors in a way that allows interleaved
Chris@0	268 execution. In this case, you can always fall back to performing multiple traversals:
Chris@0	269
Chris@0	270 ```php
Chris@0	271 $traverserA = new NodeTraverser;
Chris@0	272 $traverserA->addVisitor($visitorA);
Chris@0	273 $traverserB = new NodeTraverser;
Chris@0	274 $traverserB->addVisitor($visitorB);
Chris@0	275 $stmts = $traverserA->traverse($stmts);
Chris@0	276 $stmts = $traverserB->traverse($stmts);
Chris@0	277 ```
Chris@0	278
Chris@0	279 When using multiple visitors, it is important to understand how they interact with the various
Chris@0	280 special enterNode/leaveNode return values:
Chris@0	281
Chris@0	282 * If any visitor returns `DONT_TRAVERSE_CHILDREN`, the children will be skipped for all
Chris@0	283 visitors.
Chris@4	284 * If any visitor returns `DONT_TRAVERSE_CURRENT_AND_CHILDREN`, the children will be skipped for all
Chris@4	285 visitors, and all subsequent visitors will not visit the current node.
Chris@0	286 * If any visitor returns `STOP_TRAVERSAL`, traversal is stopped for all visitors.
Chris@0	287 * If a visitor returns a replacement node, subsequent visitors will be passed the replacement node,
Chris@0	288 not the original one.
Chris@0	289 * If a visitor returns `REMOVE_NODE`, subsequent visitors will not see this node.
Chris@0	290 * If a visitor returns an array of replacement nodes, subsequent visitors will see neither the node
Chris@0	291 that was replaced, nor the replacement nodes.
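
The first rule in the list above can be checked with two cooperating visitors. This is a sketch assuming the PHP-Parser 4.x API and a Composer autoloader; `SkipFunctionBodies` and `TypeRecorder` are hypothetical class names used only for this illustration.

```php
<?php
// Assumption: nikic/php-parser 4.x installed via Composer next to this script.
require __DIR__ . '/vendor/autoload.php';

use PhpParser\{Node, NodeTraverser, NodeVisitorAbstract, ParserFactory};

// Hypothetical visitor A: tells the traverser not to descend into functions.
class SkipFunctionBodies extends NodeVisitorAbstract {
    public function enterNode(Node $node) {
        if ($node instanceof Node\Stmt\Function_) {
            return NodeTraverser::DONT_TRAVERSE_CHILDREN;
        }
    }
}

// Hypothetical visitor B: records the type of every node it is shown.
class TypeRecorder extends NodeVisitorAbstract {
    public $types = [];

    public function enterNode(Node $node) {
        $this->types[] = $node->getType();
    }
}

$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$stmts = $parser->parse('<?php function f() { return 1; } echo 2;');

$recorder = new TypeRecorder;
$traverser = new NodeTraverser;
$traverser->addVisitor(new SkipFunctionBodies);
$traverser->addVisitor($recorder);
$traverser->traverse($stmts);

echo implode(', ', $recorder->types), "\n";
```

The recorder still sees the function node itself, but nothing inside it, even though it never returned `DONT_TRAVERSE_CHILDREN` on its own. If the two visitors need independent views of the tree, separate traversers are the safer choice.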
Chris@0	292
Chris@0	293 Simple node finding
Chris@0	294 -------------------
Chris@0	295
Chris@0	296 While the node visitor mechanism is very flexible, creating a node visitor can be overly cumbersome
Chris@0	297 for minor tasks. For this reason a `NodeFinder` is provided, which can find AST nodes that either
Chris@0	298 satisfy a certain callback, or which are instances of a certain node type. A couple of examples are
Chris@0	299 shown below:
Chris@0	300
Chris@0	301 ```php
Chris@0	302 use PhpParser\{Node, NodeFinder};
Chris@0	303
Chris@0	304 $nodeFinder = new NodeFinder;
Chris@0	305
Chris@0	306 // Find all class nodes.
Chris@0	307 $classes = $nodeFinder->findInstanceOf($stmts, Node\Stmt\Class_::class);
Chris@0	308
Chris@0	309 // Find all classes that extend another class
Chris@4	310 $extendingClasses = $nodeFinder->find($stmts, function(Node $node) {
Chris@0	311     return $node instanceof Node\Stmt\Class_
Chris@0	312         && $node->extends !== null;
Chris@0	313 });
Chris@0	314
Chris@0	315 // Find first class occurring in the AST. Returns null if no class exists.
Chris@0	316 $class = $nodeFinder->findFirstInstanceOf($stmts, Node\Stmt\Class_::class);
Chris@0	317
Chris@0	318 // Find first class that has name $name
Chris@0	319 $class = $nodeFinder->findFirst($stmts, function(Node $node) use ($name) {
Chris@0	320     return $node instanceof Node\Stmt\Class_
Chris@0	321         && $node->namespacedName->toString() === $name;
Chris@0	322 });
Chris@0	323 ```
Chris@0	324
Chris@0	325 Internally, the `NodeFinder` also uses a node traverser. It only simplifies the interface for a
Chris@0	326 common use case.
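
Looking up a class by its fully qualified name relies on the `namespacedName` property, which is only present after a `NameResolver` visitor has run over the AST. The following sketch puts the two steps together, assuming the PHP-Parser 4.x API and a Composer autoloader; the sample source and the `$name` value are illustrative.

```php
<?php
// Assumption: nikic/php-parser 4.x installed via Composer next to this script.
require __DIR__ . '/vendor/autoload.php';

use PhpParser\{Node, NodeFinder, NodeTraverser, ParserFactory};
use PhpParser\NodeVisitor\NameResolver;

// Illustrative input with one namespaced class that extends another.
$code = '<?php namespace Foo\Bar; class Baz extends Base {} class Qux {}';
$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$stmts = $parser->parse($code);

// NameResolver sets namespacedName on named class declarations.
$traverser = new NodeTraverser;
$traverser->addVisitor(new NameResolver);
$stmts = $traverser->traverse($stmts);

$name = 'Foo\Bar\Baz';
$nodeFinder = new NodeFinder;
$class = $nodeFinder->findFirst($stmts, function (Node $node) use ($name) {
    return $node instanceof Node\Stmt\Class_
        && isset($node->namespacedName) // not set for anonymous classes
        && $node->namespacedName->toString() === $name;
});

echo $class !== null ? $class->name->toString() : 'not found', "\n";
```

The `isset()` guard is a defensive choice for this sketch, so that anonymous classes in the input do not cause an error.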
Chris@0	327
Chris@0	328 Parent and sibling references
Chris@0	329 -----------------------------
Chris@0	330
Chris@0	331 The node visitor mechanism is somewhat rigid, in that it prescribes an order in which nodes should
Chris@0	332 be accessed: From parents to children. However, it can often be convenient to operate in the
Chris@0	333 reverse direction: When working on a node, you might want to check if the parent node satisfies a
Chris@0	334 certain property.
Chris@0	335
Chris@0	336 PHP-Parser does not add parent (or sibling) references to nodes by itself, but you can easily
Chris@4	337 emulate this with a visitor. See the [FAQ](FAQ.markdown) for more information.
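
A visitor that emulates parent references can be sketched with a stack of the nodes currently being visited. The version below assumes the PHP-Parser 4.x API and a Composer autoloader; the class name `ParentConnector` and the attribute name `parent` are choices made for this illustration.

```php
<?php
// Assumption: nikic/php-parser 4.x installed via Composer next to this script.
require __DIR__ . '/vendor/autoload.php';

use PhpParser\{Node, NodeFinder, NodeTraverser, NodeVisitorAbstract, ParserFactory};

// Hypothetical visitor: stores each node's parent in a "parent" attribute.
class ParentConnector extends NodeVisitorAbstract {
    private $stack = [];

    public function beforeTraverse(array $nodes) {
        $this->stack = [];
    }

    public function enterNode(Node $node) {
        if (!empty($this->stack)) {
            // The top of the stack is the node we are currently inside.
            $node->setAttribute('parent', end($this->stack));
        }
        $this->stack[] = $node;
    }

    public function leaveNode(Node $node) {
        array_pop($this->stack);
    }
}

$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$stmts = $parser->parse('<?php return $foobar;');

$traverser = new NodeTraverser;
$traverser->addVisitor(new ParentConnector);
$stmts = $traverser->traverse($stmts);

// Look up the variable node and walk upwards via the attribute.
$var = (new NodeFinder)->findFirstInstanceOf($stmts, Node\Expr\Variable::class);
$parent = $var->getAttribute('parent');
echo $parent->getType(), "\n";
```

Top-level statements have no `parent` attribute in this sketch, so code that walks upwards should treat a `null` result from `getAttribute('parent')` as the root.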
