comparison vendor/nikic/php-parser/doc/2_Usage_of_basic_components.markdown @ 0:4c8ae668cc8c
Initial import (non-working)
author | Chris Cannam |
---|---|
date | Wed, 29 Nov 2017 16:09:58 +0000 |
parents | |
children | 5fb285c0d0e3 |
-1:000000000000 | 0:4c8ae668cc8c |
---|---|
1 Usage of basic components | |
2 ========================= | |
3 | |
4 This document explains how to use the parser, the pretty printer and the node traverser. | |
5 | |
6 Bootstrapping | |
7 ------------- | |
8 | |
9 To bootstrap the library, include the autoloader generated by composer: | |
10 | |
11 ```php | |
12 require 'path/to/vendor/autoload.php'; | |
13 ``` | |
14 | |
15 Additionally you may want to set the `xdebug.max_nesting_level` ini option to a higher value: | |
16 | |
17 ```php | |
18 ini_set('xdebug.max_nesting_level', 3000); | |
19 ``` | |
20 | |
21 This ensures that there will be no errors when traversing highly nested node trees. However, it is | |
22 preferable to disable Xdebug completely, as it can easily make this library more than five times | |
23 slower. | |
24 | |
25 Parsing | |
26 ------- | |
27 | |
28 In order to parse code, you first have to create a parser instance: | |
29 | |
30 ```php | |
31 use PhpParser\ParserFactory; | |
32 $parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7); | |
33 ``` | |
34 | |
35 The factory accepts a kind argument that determines how different PHP versions are treated: | |
36 | |
37 Kind | Behavior | |
38 -----|--------- | |
39 `ParserFactory::PREFER_PHP7` | Try to parse code as PHP 7. If this fails, try to parse it as PHP 5. | |
40 `ParserFactory::PREFER_PHP5` | Try to parse code as PHP 5. If this fails, try to parse it as PHP 7. | |
41 `ParserFactory::ONLY_PHP7` | Parse code as PHP 7. | |
42 `ParserFactory::ONLY_PHP5` | Parse code as PHP 5. | |
43 | |
44 Unless you have a strong reason to use something else, `PREFER_PHP7` is a reasonable default. | |
45 | |
46 The `create()` method optionally accepts a `Lexer` instance as the second argument. Some use cases | |
47 that require customized lexers are discussed in the [lexer documentation](component/Lexer.markdown). | |
48 | |
49 Subsequently you can pass PHP code (including the opening `<?php` tag) to the `parse` method in order to | |
50 create a syntax tree. If a syntax error is encountered, a `PhpParser\Error` exception will be thrown: | |
51 | |
52 ```php | |
53 use PhpParser\Error; | |
54 use PhpParser\ParserFactory; | |
55 | |
56 $code = '<?php // some code'; | |
57 $parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7); | |
58 | |
59 try { | |
60 $stmts = $parser->parse($code); | |
61 // $stmts is an array of statement nodes | |
62 } catch (Error $e) { | |
63 echo 'Parse Error: ', $e->getMessage(); | |
64 } | |
65 ``` | |
66 | |
67 A parser instance can be reused to parse multiple files. | |
68 | |
69 Node tree | |
70 --------- | |
71 | |
72 If you use the above code with `$code = "<?php echo 'Hi ', hi\\getTarget();"` the parser will | |
73 generate a node tree looking like this: | |
74 | |
75 ``` | |
76 array( | |
77 0: Stmt_Echo( | |
78 exprs: array( | |
79 0: Scalar_String( | |
80 value: Hi | |
81 ) | |
82 1: Expr_FuncCall( | |
83 name: Name( | |
84 parts: array( | |
85 0: hi | |
86 1: getTarget | |
87 ) | |
88 ) | |
89 args: array( | |
90 ) | |
91 ) | |
92 ) | |
93 ) | |
94 ) | |
95 ``` | |
96 | |
97 Thus `$stmts` will contain an array with only one node, with this node being an instance of | |
98 `PhpParser\Node\Stmt\Echo_`. | |
99 | |
100 As PHP is a large language, there are approximately 140 different nodes. To make working | |
101 with them easier, they are grouped into three categories: | |
102 | |
103 * `PhpParser\Node\Stmt`s are statement nodes, i.e. language constructs that do not return | |
104 a value and cannot occur in an expression. For example, a class definition is a statement. | |
105 It doesn't return a value and you can't write something like `func(class A {});`. | |
106 * `PhpParser\Node\Expr`s are expression nodes, i.e. language constructs that return a value | |
107 and thus can occur in other expressions. Examples of expressions are `$var` | |
108 (`PhpParser\Node\Expr\Variable`) and `func()` (`PhpParser\Node\Expr\FuncCall`). | |
109 * `PhpParser\Node\Scalar`s are nodes representing scalar values, like `'string'` | |
110 (`PhpParser\Node\Scalar\String_`), `0` (`PhpParser\Node\Scalar\LNumber`) or magic constants | |
111 like `__FILE__` (`PhpParser\Node\Scalar\MagicConst\File`). All `PhpParser\Node\Scalar`s extend | |
112 `PhpParser\Node\Expr`, as scalars are expressions, too. | |
113 * There are some nodes not in any of these groups, for example names (`PhpParser\Node\Name`) | |
114 and call arguments (`PhpParser\Node\Arg`). | |
115 | |
116 Some node class names have a trailing `_`. This is used whenever the class name would otherwise clash | |
117 with a PHP keyword. | |
118 | |
119 Every node has a (possibly zero) number of subnodes. You can access subnodes by writing | |
120 `$node->subNodeName`. The `Stmt\Echo_` node has only one subnode `exprs`. So in order to access it | |
121 in the above example you would write `$stmts[0]->exprs`. If you wanted to access the name of the function | |
122 call, you would write `$stmts[0]->exprs[1]->name`. | |
123 | |
124 All nodes also define a `getType()` method that returns the node type. The type is the class name | |
125 without the `PhpParser\Node\` prefix and `\` replaced with `_`. It also does not contain a trailing | |
126 `_` for reserved-keyword class names. | |
127 | |
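The naming rule above can be restated as a short sketch. `nodeTypeFromClassName()` is a hypothetical helper written for illustration only. It mirrors the rule as described here and is not presented as the library's own implementation.

```php
// Hypothetical helper: derives the type string from a class name using the
// rule described above (strip prefix, drop trailing "_", "\" becomes "_").
function nodeTypeFromClassName(string $className): string
{
    $prefix = 'PhpParser\\Node\\';

    // Drop the common namespace prefix if it is present.
    if (strpos($className, $prefix) === 0) {
        $className = substr($className, strlen($prefix));
    }

    // Remove the trailing underscore used to avoid keyword clashes,
    // then turn the remaining namespace separators into underscores.
    return str_replace('\\', '_', rtrim($className, '_'));
}

echo nodeTypeFromClassName('PhpParser\\Node\\Stmt\\Echo_'), "\n";     // Stmt_Echo
echo nodeTypeFromClassName('PhpParser\\Node\\Expr\\FuncCall'), "\n";  // Expr_FuncCall
```

These are the same type names that appear in the node dump shown earlier (`Stmt_Echo`, `Expr_FuncCall`).
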
128 It is possible to associate custom metadata with a node using the `setAttribute()` method. This data | |
129 can then be retrieved using `hasAttribute()`, `getAttribute()` and `getAttributes()`. | |
130 | |
131 By default the lexer adds the `startLine`, `endLine` and `comments` attributes. `comments` is an array | |
132 of `PhpParser\Comment[\Doc]` instances. | |
133 | |
134 The start line can also be accessed using `getLine()`/`setLine()` (instead of `getAttribute('startLine')`). | |
135 The last doc comment from the `comments` attribute can be obtained using `getDocComment()`. | |
136 | |
137 Pretty printer | |
138 -------------- | |
139 | |
140 The pretty printer component compiles the AST back to PHP code. As the parser does not retain formatting | |
141 information the formatting is done using a specified scheme. Currently there is only one scheme available, | |
142 namely `PhpParser\PrettyPrinter\Standard`. | |
143 | |
144 ```php | |
145 use PhpParser\Error; | |
146 use PhpParser\ParserFactory; | |
147 use PhpParser\PrettyPrinter; | |
148 | |
149 $code = "<?php echo 'Hi ', hi\\getTarget();"; | |
150 | |
151 $parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7); | |
152 $prettyPrinter = new PrettyPrinter\Standard; | |
153 | |
154 try { | |
155 // parse | |
156 $stmts = $parser->parse($code); | |
157 | |
158 // change | |
159 $stmts[0] // the echo statement | |
160 ->exprs // sub expressions | |
161 [0] // the first of them (the string node) | |
162 ->value // its value, i.e. 'Hi ' | |
163 = 'Hello '; // change to 'Hello ' | |
164 | |
165 // pretty print | |
166 $code = $prettyPrinter->prettyPrint($stmts); | |
167 | |
168 echo $code; | |
169 } catch (Error $e) { | |
170 echo 'Parse Error: ', $e->getMessage(); | |
171 } | |
172 ``` | |
173 | |
174 The above code will output: | |
175 | |
176 <?php echo 'Hello ', hi\getTarget(); | |
177 | |
178 As you can see the source code was first parsed using `PhpParser\Parser->parse()`, then changed and then | |
179 again converted to code using `PhpParser\PrettyPrinter\Standard->prettyPrint()`. | |
180 | |
181 The `prettyPrint()` method pretty prints a statements array. It is also possible to pretty print only a | |
182 single expression using `prettyPrintExpr()`. | |
183 | |
184 The `prettyPrintFile()` method can be used to print an entire file. This will include the opening `<?php` tag | |
185 and handle inline HTML as the first/last statement more gracefully. | |
186 | |
187 Node traversal | |
188 -------------- | |
189 | |
190 The above pretty printing example used the fact that the source code was known and thus it was easy to | |
191 write code that accesses a certain part of a node tree and changes it. Normally this is not the case. | |
192 Usually you want to change or analyze code in a generic way, where you don't know what the node tree is | |
193 going to look like. | |
194 | |
195 For this purpose the parser provides a component for traversing and visiting the node tree. The basic | |
196 structure of a program using this `PhpParser\NodeTraverser` looks like this: | |
197 | |
198 ```php | |
199 use PhpParser\NodeTraverser; | |
200 use PhpParser\ParserFactory; | |
201 use PhpParser\PrettyPrinter; | |
202 | |
203 $parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7); | |
204 $traverser = new NodeTraverser; | |
205 $prettyPrinter = new PrettyPrinter\Standard; | |
206 | |
207 // add your visitor | |
208 $traverser->addVisitor(new MyNodeVisitor); | |
209 | |
210 try { | |
211 $code = file_get_contents($fileName); | |
212 | |
213 // parse | |
214 $stmts = $parser->parse($code); | |
215 | |
216 // traverse | |
217 $stmts = $traverser->traverse($stmts); | |
218 | |
219 // pretty print | |
220 $code = $prettyPrinter->prettyPrintFile($stmts); | |
221 | |
222 echo $code; | |
223 } catch (PhpParser\Error $e) { | |
224 echo 'Parse Error: ', $e->getMessage(); | |
225 } | |
226 ``` | |
227 | |
228 The corresponding node visitor might look like this: | |
229 | |
230 ```php | |
231 use PhpParser\Node; | |
232 use PhpParser\NodeVisitorAbstract; | |
233 | |
234 class MyNodeVisitor extends NodeVisitorAbstract | |
235 { | |
236 public function leaveNode(Node $node) { | |
237 if ($node instanceof Node\Scalar\String_) { | |
238 $node->value = 'foo'; | |
239 } | |
240 } | |
241 } | |
242 ``` | |
243 | |
244 The above node visitor would change all string literals in the program to `'foo'`. | |
245 | |
246 All visitors must implement the `PhpParser\NodeVisitor` interface, which defines the following four | |
247 methods: | |
248 | |
249 ```php | |
250 public function beforeTraverse(array $nodes); | |
251 public function enterNode(\PhpParser\Node $node); | |
252 public function leaveNode(\PhpParser\Node $node); | |
253 public function afterTraverse(array $nodes); | |
254 ``` | |
255 | |
256 The `beforeTraverse()` method is called once before the traversal begins and is passed the nodes the | |
257 traverser was called with. This method can be used for resetting values before traversal or | |
258 preparing the tree for traversal. | |
259 | |
260 The `afterTraverse()` method is similar to the `beforeTraverse()` method, with the only difference that | |
261 it is called once after the traversal. | |
262 | |
263 The `enterNode()` and `leaveNode()` methods are called on every node, the former when it is entered, | |
264 i.e. before its subnodes are traversed, the latter when it is left. | |
265 | |
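The enter/leave ordering can be pictured with a toy walker over plain arrays. This sketch only mimics the call order described above. The `walk()` function and the array-based tree are assumptions made for illustration and are not part of `PhpParser\NodeTraverser`.

```php
// Toy tree: each node is an array with a 'type' and a list of 'children'.
// This imitates the call order only; it is not the library's traverser.
function walk(array $node, array &$log): void
{
    $log[] = 'enter ' . $node['type'];   // fires before the subnodes
    foreach ($node['children'] as $child) {
        walk($child, $log);
    }
    $log[] = 'leave ' . $node['type'];   // fires after the subnodes
}

$tree = [
    'type' => 'Stmt_Echo',
    'children' => [
        ['type' => 'Scalar_String', 'children' => []],
        ['type' => 'Expr_FuncCall', 'children' => [
            ['type' => 'Name', 'children' => []],
        ]],
    ],
];

$log = [];
walk($tree, $log);
echo implode("\n", $log), "\n";
```

The log starts with `enter Stmt_Echo` and ends with `leave Stmt_Echo`, so a parent is left only after all of its subnodes have been entered and left.
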
266 All four methods can either return the changed node or not return at all (i.e. `null`), in which | |
267 case the current node is not changed. | |
268 | |
269 The `enterNode()` method can additionally return the value `NodeTraverser::DONT_TRAVERSE_CHILDREN`, | |
270 which instructs the traverser to skip all children of the current node. | |
271 | |
272 The `leaveNode()` method can additionally return the value `NodeTraverser::REMOVE_NODE`, in which | |
273 case the current node will be removed from the parent array. Furthermore it is possible to return | |
274 an array of nodes, which will be merged into the parent array at the offset of the current node. | |
275 I.e. if in `array(A, B, C)` the node `B` should be replaced with `array(X, Y, Z)` the result will | |
276 be `array(A, X, Y, Z, C)`. | |
277 | |
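The effect on the parent array can be reproduced with plain PHP arrays. This is only an illustration of the resulting array shape using `array_splice()`. It makes no claim about how the traverser performs the merge internally.

```php
// Replacing B with array(X, Y, Z) at B's offset.
$parent = ['A', 'B', 'C'];
$offset = array_search('B', $parent, true);          // offset of B, here 1
array_splice($parent, $offset, 1, ['X', 'Y', 'Z']);  // swap 1 element for 3
print_r($parent);                                    // A, X, Y, Z, C

// Removing B altogether, which is the shape a removed node leaves behind.
$siblings = ['A', 'B', 'C'];
array_splice($siblings, 1, 1);
print_r($siblings);                                  // A, C
```
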
278 Instead of manually implementing the `NodeVisitor` interface you can also extend the `NodeVisitorAbstract` | |
279 class, which will define empty default implementations for all the above methods. | |
280 | |
281 The NameResolver node visitor | |
282 ----------------------------- | |
283 | |
284 One visitor is already bundled with the package: `PhpParser\NodeVisitor\NameResolver`. This visitor | |
285 helps you work with namespaced code by trying to resolve most names to fully qualified ones. | |
286 | |
287 For example, consider the following code: | |
288 | |
289 use A as B; | |
290 new B\C(); | |
291 | |
292 In order to know that `B\C` really is `A\C` you would need to track aliases and namespaces yourself. | |
293 The `NameResolver` takes care of that and resolves names as far as possible. | |
294 | |
295 After running it most names will be fully qualified. The only names that will stay unqualified are | |
296 unqualified function and constant names. These are resolved at runtime and thus the visitor can't | |
297 know which function they are referring to. In most cases this is a non-issue as the global functions | |
298 are meant. | |
299 | |
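To make the alias bookkeeping concrete, here is a deliberately simplified sketch of class-name resolution. `resolveClassName()` is a hypothetical helper, not the `NameResolver` implementation. It assumes case-sensitive aliases and ignores functions, constants and special names such as `self`.

```php
// Hypothetical helper: resolves a class name against `use` aliases and the
// current namespace. A simplification for illustration only.
function resolveClassName(string $name, array $aliases, string $namespace = ''): string
{
    // Already fully qualified: just drop the leading backslash.
    if ($name !== '' && $name[0] === '\\') {
        return substr($name, 1);
    }

    $parts = explode('\\', $name);

    // If the first part matches an alias from a `use` statement, substitute it.
    if (isset($aliases[$parts[0]])) {
        $parts[0] = $aliases[$parts[0]];
        return implode('\\', $parts);
    }

    // Otherwise the name is relative to the current namespace.
    return $namespace === '' ? $name : $namespace . '\\' . $name;
}

// `use A as B; new B\C();`
echo resolveClassName('B\\C', ['B' => 'A']), "\n";   // A\C
```
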
300 Also the `NameResolver` adds a `namespacedName` subnode to class, function and constant declarations | |
301 that contains the namespaced name instead of only the shortname that is available via `name`. | |
302 | |
303 Example: Converting namespaced code to pseudo namespaces | |
304 -------------------------------------------------------- | |
305 | |
306 A small example to understand the concept: We want to convert namespaced code to pseudo namespaces | |
307 so it works on PHP 5.2, i.e. names like `A\B` should be converted to `A_B`. Note that such conversions | |
308 are fairly complicated if you take PHP's dynamic features into account, so our conversion will | |
309 assume that no dynamic features are used. | |
310 | |
311 We start off with the following base code: | |
312 | |
313 ```php | |
314 use PhpParser\ParserFactory; | |
315 use PhpParser\PrettyPrinter; | |
316 use PhpParser\NodeTraverser; | |
317 use PhpParser\NodeVisitor\NameResolver; | |
318 | |
319 $inDir = '/some/path'; | |
320 $outDir = '/some/other/path'; | |
321 | |
322 $parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7); | |
323 $traverser = new NodeTraverser; | |
324 $prettyPrinter = new PrettyPrinter\Standard; | |
325 | |
326 $traverser->addVisitor(new NameResolver); // we will need resolved names | |
327 $traverser->addVisitor(new NamespaceConverter); // our own node visitor | |
328 | |
329 // iterate over all .php files in the directory | |
330 $files = new \RecursiveIteratorIterator(new \RecursiveDirectoryIterator($inDir)); | |
331 $files = new \RegexIterator($files, '/\.php$/'); | |
332 | |
333 foreach ($files as $file) { | |
334 try { | |
335 // read the file that should be converted | |
336 $code = file_get_contents($file); | |
337 | |
338 // parse | |
339 $stmts = $parser->parse($code); | |
340 | |
341 // traverse | |
342 $stmts = $traverser->traverse($stmts); | |
343 | |
344 // pretty print | |
345 $code = $prettyPrinter->prettyPrintFile($stmts); | |
346 | |
347 // write the converted file to the target directory | |
348 file_put_contents( | |
349 substr_replace($file->getPathname(), $outDir, 0, strlen($inDir)), | |
350 $code | |
351 ); | |
352 } catch (PhpParser\Error $e) { | |
353 echo 'Parse Error: ', $e->getMessage(); | |
354 } | |
355 } | |
356 ``` | |
357 | |
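The target path in the base code is computed by swapping the leading `$inDir` portion of each file path for `$outDir`. A small sketch of that mapping follows. The `$source` path is a made-up example.

```php
$inDir  = '/some/path';
$outDir = '/some/other/path';
$source = '/some/path/sub/File.php';   // hypothetical input file

// Replace the first strlen($inDir) characters with $outDir, keeping the
// part of the path that is relative to the input directory.
$target = substr_replace($source, $outDir, 0, strlen($inDir));
echo $target, "\n";   // /some/other/path/sub/File.php

// file_put_contents() does not create missing directories, so a real script
// would likely need something along these lines before writing:
// if (!is_dir(dirname($target))) { mkdir(dirname($target), 0777, true); }
```
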
358 Now let's start with the main code, the `NamespaceConverter` visitor. One thing it needs to do | |
359 is convert `A\B` style names to `A_B` style ones. | |
360 | |
361 ```php | |
362 use PhpParser\Node; | |
363 | |
364 class NamespaceConverter extends \PhpParser\NodeVisitorAbstract | |
365 { | |
366 public function leaveNode(Node $node) { | |
367 if ($node instanceof Node\Name) { | |
368 return new Node\Name($node->toString('_')); | |
369 } | |
370 } | |
371 } | |
372 ``` | |
373 | |
374 The above code profits from the fact that the `NameResolver` already resolved all names as far as | |
375 possible, so we don't need to do that. We only need to create a string with the name parts separated | |
376 by underscores instead of backslashes. This is what `$node->toString('_')` does. (If you want to | |
377 create a name with backslashes either write `$node->toString()` or `(string) $node`.) Then we create | |
378 a new name from the string and return it. Returning a new node replaces the old node. | |
379 | |
380 Another thing we need to do is change the class/function/const declarations. Currently they contain | |
381 only the shortname (i.e. the last part of the name), but they need to contain the complete name including | |
382 the namespace prefix: | |
383 | |
384 ```php | |
385 use PhpParser\Node; | |
386 use PhpParser\Node\Stmt; | |
387 | |
388 class NamespaceConverter extends \PhpParser\NodeVisitorAbstract | |
389 { | |
390 public function leaveNode(Node $node) { | |
391 if ($node instanceof Node\Name) { | |
392 return new Node\Name($node->toString('_')); | |
393 } elseif ($node instanceof Stmt\Class_ | |
394 || $node instanceof Stmt\Interface_ | |
395 || $node instanceof Stmt\Function_) { | |
396 $node->name = $node->namespacedName->toString('_'); | |
397 } elseif ($node instanceof Stmt\Const_) { | |
398 foreach ($node->consts as $const) { | |
399 $const->name = $const->namespacedName->toString('_'); | |
400 } | |
401 } | |
402 } | |
403 } | |
404 ``` | |
405 | |
406 There is not much more to it than converting the namespaced name to string with `_` as separator. | |
407 | |
408 The last thing we need to do is remove the `namespace` and `use` statements: | |
409 | |
410 ```php | |
411 use PhpParser\Node; | |
412 use PhpParser\Node\Stmt; | |
413 | |
414 class NamespaceConverter extends \PhpParser\NodeVisitorAbstract | |
415 { | |
416 public function leaveNode(Node $node) { | |
417 if ($node instanceof Node\Name) { | |
418 return new Node\Name($node->toString('_')); | |
419 } elseif ($node instanceof Stmt\Class_ | |
420 || $node instanceof Stmt\Interface_ | |
421 || $node instanceof Stmt\Function_) { | |
422 $node->name = $node->namespacedName->toString('_'); | |
423 } elseif ($node instanceof Stmt\Const_) { | |
424 foreach ($node->consts as $const) { | |
425 $const->name = $const->namespacedName->toString('_'); | |
426 } | |
427 } elseif ($node instanceof Stmt\Namespace_) { | |
428 // returning an array merges it into the parent array | |
429 return $node->stmts; | |
430 } elseif ($node instanceof Stmt\Use_) { | |
431 // returning NodeTraverser::REMOVE_NODE removes the node altogether | |
432 return \PhpParser\NodeTraverser::REMOVE_NODE; | |
433 } | |
434 } | |
435 } | |
436 ``` | |
437 | |
438 That's all. |