comparison vendor/nikic/php-parser/doc/0_Introduction.markdown @ 0:4c8ae668cc8c
Initial import (non-working)
author | Chris Cannam |
---|---|
date | Wed, 29 Nov 2017 16:09:58 +0000 |
parents | |
children | 5fb285c0d0e3 |
-1:000000000000 | 0:4c8ae668cc8c |
---|---|
1 Introduction | |
2 ============ | |
3 | |
4 This project is a PHP 5.2 to PHP 7.1 parser **written in PHP itself**. | |
5 | |
6 What is this for? | |
7 ----------------- | |
8 | |
9 A parser is useful for [static analysis][0], manipulation of code and basically any other | |
10 application dealing with code programmatically. A parser constructs an [Abstract Syntax Tree][1] | |
11 (AST) of the code and thus allows dealing with it in an abstract and robust way. | |
12 | |
13 There are other ways of processing source code. One that PHP supports natively is using the | |
14 token stream generated by [`token_get_all`][2]. The token stream is much more low level than | |
15 the AST and thus has different applications: it also lets you analyze the exact formatting of | |
16 a file. On the other hand, the token stream is much harder to work with for more complex analysis. | |
17 For example, an AST abstracts away the fact that in PHP variables can be written as `$foo`, but also | |
18 as `$$bar`, `${'foobar'}` or even `${!${''}=barfoo()}`. You don't have to worry about recognizing | |
19 all the different syntaxes from a stream of tokens. | |
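
To make the difference concrete, here is a minimal sketch that uses only PHP's built-in `token_get_all` and `token_name` to list the tokens for two spellings of a variable. The helper name `tokenNames` is made up for this illustration and is not part of this project.

```php
<?php
// Hypothetical helper: return the token names that token_get_all() produces for a snippet.
function tokenNames($code)
{
    $names = array();
    foreach (token_get_all($code) as $token) {
        // A token is either an array (id, text, line) or a single-character string.
        $names[] = is_array($token) ? token_name($token[0]) : $token;
    }
    return $names;
}

// The same concept, "a variable", yields differently shaped token sequences.
print_r(tokenNames('<?php $foo;'));
print_r(tokenNames('<?php $$bar;'));
```

The second snippet produces an extra `$` token in front of the `T_VARIABLE` token, so any token-based tool has to recognize each spelling on its own, whereas the AST represents both as a variable node.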
20 | |
21 Another question is: why would I want a PHP parser *written in PHP*? Well, PHP might not be | |
22 a language especially suited for fast parsing, but processing the AST is much easier in PHP than it | |
23 would be in other, faster languages like C. Furthermore, the people most likely to want | |
24 programmatic PHP code analysis are PHP developers, not C developers. | |
25 | |
26 What can it parse? | |
27 ------------------ | |
28 | |
29 The parser supports parsing PHP 5.2-5.6 and PHP 7. | |
30 | |
31 As the parser is based on the tokens returned by `token_get_all` (which is only able to lex the PHP | |
32 version it runs on), a wrapper for emulating tokens from newer versions is also provided. | |
33 This makes it possible to parse PHP 7.1 source code while running on PHP 5.5, for example. This emulation is somewhat | |
34 hacky and not perfect, but it should work well on any sane code. | |
35 | |
36 What output does it produce? | |
37 ---------------------------- | |
38 | |
39 The parser produces an [Abstract Syntax Tree][1] (AST), also known as a node tree. What this looks like | |
40 is best seen in an example. The program `<?php echo 'Hi', 'World';` will give you a node tree | |
41 that looks roughly like this: | |
42 | |
43 ``` | |
44 array( | |
45     0: Stmt_Echo( | |
46         exprs: array( | |
47             0: Scalar_String( | |
48                 value: Hi | |
49             ) | |
50             1: Scalar_String( | |
51                 value: World | |
52             ) | |
53         ) | |
54     ) | |
55 ) | |
56 ``` | |
57 | |
58 This matches the structure of the code: an echo statement, which takes two strings as expressions, | |
59 with the values `Hi` and `World`. | |
60 | |
61 You can also see that the AST does not contain any whitespace information (but most comments are saved). | |
62 So using it for formatting analysis is not possible. | |
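
A dump like the one above can be produced with a few lines of code. The following is a sketch only: it assumes the library was installed via Composer (hence the `vendor/autoload.php` path) and that the installed release provides `ParserFactory::PREFER_PHP7`, which may not hold for every version.

```php
<?php
// Assumption: nikic/php-parser is installed via Composer in ./vendor.
require 'vendor/autoload.php';

use PhpParser\Error;
use PhpParser\NodeDumper;
use PhpParser\ParserFactory;

$code = "<?php echo 'Hi', 'World';";

// Prefer the PHP 7 grammar; the factory constant is assumed to exist in the installed release.
$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);

try {
    // parse() returns an array of statement nodes.
    $stmts = $parser->parse($code);
    // NodeDumper renders the node tree in a human-readable form.
    echo (new NodeDumper)->dump($stmts), "\n";
} catch (Error $e) {
    echo 'Parse error: ', $e->getMessage(), "\n";
}
```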
63 | |
64 What else can it do? | |
65 -------------------- | |
66 | |
67 Apart from the parser itself this package also bundles support for some other, related features: | |
68 | |
69 * Support for pretty printing, which is the act of converting an AST into PHP code. Please note | |
70 that "pretty printing" does not imply that the output is especially pretty. That's just what it's | |
71 called ;) | |
72 * Support for serializing and unserializing the node tree to XML | |
73 * Support for dumping the node tree in a human-readable form (see the section above for an | |
74 example of what the output looks like) | |
75 * Infrastructure for traversing and changing the AST (node traverser and node visitors) | |
76 * A node visitor for resolving namespaced names | |
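
As a rough illustration of how several of these pieces fit together, the sketch below parses a snippet, runs the name-resolving visitor over it, and pretty prints the result. It assumes a Composer install and an API that provides `ParserFactory::PREFER_PHP7`; the input code and the names `App` and `Lib\Util` are invented for the example.

```php
<?php
// Assumption: nikic/php-parser is installed via Composer in ./vendor.
require 'vendor/autoload.php';

use PhpParser\NodeTraverser;
use PhpParser\NodeVisitor\NameResolver;
use PhpParser\ParserFactory;
use PhpParser\PrettyPrinter;

// Made-up input: `Util` is imported with a `use` statement inside a namespace.
$code = '<?php namespace App; use Lib\Util; echo Util::VERSION;';

$parser = (new ParserFactory)->create(ParserFactory::PREFER_PHP7);
$stmts = $parser->parse($code);

// The traverser walks the tree and lets each registered visitor inspect or change nodes.
$traverser = new NodeTraverser;
$traverser->addVisitor(new NameResolver);
$stmts = $traverser->traverse($stmts);

// Convert the (possibly modified) AST back into PHP source code.
$output = (new PrettyPrinter\Standard)->prettyPrintFile($stmts);
echo $output, "\n";
```

After the traversal, the reference to `Util` should appear in the printed code in its fully qualified form, which is the usual starting point for analyses that need to know which class a name refers to.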
77 | |
78 [0]: http://en.wikipedia.org/wiki/Static_program_analysis | |
79 [1]: http://en.wikipedia.org/wiki/Abstract_syntax_tree | |
80 [2]: http://php.net/token_get_all |