annotate vendor/zendframework/zend-feed/doc/book/security.md @ 2:92f882872392

Trusted hosts, + remove migration modules
author Chris Cannam
date Tue, 05 Dec 2017 09:26:43 +0000
parents 4c8ae668cc8c
# Zend\\Feed\\Reader and Security

As with any data coming from a source beyond the developer's control, feed data
must be secured, validated, and filtered. Like user input, data from RSS and
Atom feeds should be considered unsafe and potentially dangerous, because feeds
can deliver HTML and [XHTML](http://tools.ietf.org/html/rfc4287#section-8.1).
Data validation and filtering are outside the scope of `Zend\Feed`, so the task
is left to the developer, using libraries such as zend-escaper for escaping and
[HTMLPurifier](http://www.htmlpurifier.org/) for validating and filtering feed
data.

Escaping and filtering potentially insecure data is highly recommended before
outputting it anywhere in the application or storing it in any storage engine,
whether a simple file or a database.

## Filtering data using HTMLPurifier

[HTMLPurifier](http://www.htmlpurifier.org/) is a widely used library for
filtering and validating (X)HTML data in PHP and is the recommended tool for
this task. It filters out all (X)HTML from the data except the tags and
attributes explicitly allowed in a whitelist, and it checks and fixes tag
nesting to ensure standards-compliant output.

The following example shows basic usage of HTMLPurifier, but developers are
urged to read [HTMLPurifier's documentation](http://www.htmlpurifier.org/docs).

```php
// Setting HTMLPurifier's options
$options = [
    // Allow only paragraph tags
    // and anchor tags with the href attribute
    [
        'HTML.Allowed',
        'p,a[href]'
    ],
    // Format end output with Tidy
    [
        'Output.TidyFormat',
        true
    ],
    // Assume XHTML 1.0 Strict Doctype
    [
        'HTML.Doctype',
        'XHTML 1.0 Strict'
    ],
    // Disable cache, but see note after the example
    [
        'Cache.DefinitionImpl',
        null
    ]
];

// Configuring HTMLPurifier
$config = HTMLPurifier_Config::createDefault();
foreach ($options as $option) {
    $config->set($option[0], $option[1]);
}

// Creating an HTMLPurifier instance with its config
$purifier = new HTMLPurifier($config);

// Fetch the RSS
try {
    $rss = Zend\Feed\Reader\Reader::import('http://www.planet-php.net/rss/');
} catch (Zend\Feed\Reader\Exception\RuntimeException $e) {
    // feed import failed
    echo "Exception caught importing feed: {$e->getMessage()}\n";
    exit;
}

// Initialize the channel data array
// Note that we're cleaning the description with HTMLPurifier
$channel = [
    'title'       => $rss->getTitle(),
    'link'        => $rss->getLink(),
    'description' => $purifier->purify($rss->getDescription()),
    'items'       => [],
];

// Loop over each channel item and store relevant data
// Note that we're cleaning the descriptions with HTMLPurifier
foreach ($rss as $item) {
    $channel['items'][] = [
        'title'       => $item->getTitle(),
        'link'        => $item->getLink(),
        'description' => $purifier->purify($item->getDescription()),
    ];
}
```

> ### Tidy is required
>
> With `Output.TidyFormat` enabled, HTMLPurifier uses the PHP
> [Tidy extension](http://php.net/tidy) to clean and repair the final output. If
> the extension is not available, the formatting step silently fails, but this
> has no impact on the library's security.

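Because the formatting step is skipped silently, it can be useful to make the dependency explicit in the option list. The following is a minimal sketch, not part of HTMLPurifier or zend-feed: `buildPurifierOptions()` is a hypothetical helper that returns the same kind of option pairs used in the example above and adds `Output.TidyFormat` only when the Tidy extension is loaded.

```php
// Hypothetical helper (an assumption for illustration): builds the option
// pairs consumed by the foreach/$config->set() loop in the example above.
function buildPurifierOptions(bool $tidyAvailable): array
{
    $options = [
        ['HTML.Allowed', 'p,a[href]'],
        ['HTML.Doctype', 'XHTML 1.0 Strict'],
    ];

    // Only request Tidy formatting when the extension can actually provide it.
    if ($tidyAvailable) {
        $options[] = ['Output.TidyFormat', true];
    }

    return $options;
}

// extension_loaded() is a PHP built-in; 'tidy' is the extension's name.
$options = buildPurifierOptions(extension_loaded('tidy'));
```

Passing the availability flag as a parameter keeps the helper easy to exercise in both situations, regardless of how the local PHP installation is built.
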
> ### Caching
>
> For the sake of this example, HTMLPurifier's cache is disabled, but it is
> recommended to configure caching and to use the library's standalone include
> file, as both can improve the performance of HTMLPurifier substantially.

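As a rough sketch of what configuring the cache might look like, the hypothetical helper below produces option pairs that could replace the `Cache.DefinitionImpl => null` entry in the example. It assumes the serializer-based cache and a writable directory; the directory location used here is an assumption, and loading the standalone include file is not shown.

```php
// Hypothetical helper (not part of HTMLPurifier): prepares a cache directory
// and returns the matching option pairs for the $options array.
function buildCacheOptions(string $cacheDir): array
{
    // The serializer cache needs a directory the PHP process can write to.
    if (!is_dir($cacheDir) && !mkdir($cacheDir, 0775, true) && !is_dir($cacheDir)) {
        throw new RuntimeException("Unable to create cache directory: {$cacheDir}");
    }

    return [
        ['Cache.DefinitionImpl', 'Serializer'],
        ['Cache.SerializerPath', $cacheDir],
    ];
}

// The path below is an arbitrary choice for illustration.
$cacheOptions = buildCacheOptions(sys_get_temp_dir() . '/htmlpurifier-cache');
```

The resulting pairs can be merged into `$options` before the `$config->set()` loop runs.
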
## Escaping data using zend-escaper

To help prevent XSS attacks, Zend Framework provides the [zend-escaper component](https://github.com/zendframework/zend-escaper),
which complies with the current [OWASP recommendations](https://www.owasp.org/index.php/XSS_Prevention_Cheat_Sheet)
and, as such, is the recommended tool for escaping HTML tags and attributes,
JavaScript, CSS, and URLs before outputting any potentially insecure data to
users.

```php
try {
    $rss = Zend\Feed\Reader\Reader::import('http://www.planet-php.net/rss/');
} catch (Zend\Feed\Reader\Exception\RuntimeException $e) {
    // feed import failed
    echo "Exception caught importing feed: {$e->getMessage()}\n";
    exit;
}

// Validate all URIs
$linkValidator = new Zend\Validator\Uri();
$link = null;
if ($linkValidator->isValid($rss->getLink())) {
    $link = $rss->getLink();
}

// Escaper used for escaping data
$escaper = new Zend\Escaper\Escaper('utf-8');

// Initialize the channel data array
$channel = [
    'title'       => $escaper->escapeHtml($rss->getTitle()),
    'link'        => $escaper->escapeUrl($link),
    'description' => $escaper->escapeHtml($rss->getDescription()),
    'items'       => [],
];

// Loop over each channel item and store relevant data
foreach ($rss as $item) {
    // Validate the item's own link, not the channel's
    $link = null;
    if ($linkValidator->isValid($item->getLink())) {
        $link = $item->getLink();
    }
    $channel['items'][] = [
        'title'       => $escaper->escapeHtml($item->getTitle()),
        'link'        => $escaper->escapeUrl($link),
        'description' => $escaper->escapeHtml($item->getDescription()),
    ];
}
```

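The example validates each link syntactically. An application may also want to restrict which URI schemes it accepts before rendering a link. The sketch below illustrates one way to do that with PHP's built-in `parse_url()`; `isAllowedFeedLink()` and its default allowlist are hypothetical and not part of zend-feed or zend-validator.

```php
// Hypothetical helper (an assumption for illustration): accepts a link only
// if it is an absolute URL with a host and an allowlisted scheme.
function isAllowedFeedLink(?string $link, array $allowedSchemes = ['http', 'https']): bool
{
    if ($link === null || $link === '') {
        return false;
    }

    // parse_url() returns false for seriously malformed input.
    $parts = parse_url($link);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return false;
    }

    // Schemes are case-insensitive, so normalize before comparing.
    return in_array(strtolower($parts['scheme']), $allowedSchemes, true);
}
```

A check like this could run alongside the validator call in the loop above, setting `$link` to `null` whenever it returns `false`.
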
The feed data is now safe for output to HTML templates. You can, of course, skip
escaping when simply storing the data persistently, but remember to escape it on
output later!

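To illustrate the store-raw, escape-on-output approach, here is a minimal sketch. The sample data and the `renderItem()` helper are made up for this illustration, and PHP's built-in `htmlspecialchars()` stands in for `$escaper->escapeHtml()` so that the snippet runs without zend-escaper installed.

```php
// Raw values as they might be read back from storage (made-up sample data).
$stored = [
    'title'       => 'News <script>alert(1)</script>',
    'description' => 'Tom & Jerry say "hi"',
];

// Hypothetical output helper: escaping happens at the moment of output.
// htmlspecialchars() is a stand-in for $escaper->escapeHtml().
function renderItem(array $item): string
{
    $flags = ENT_QUOTES | ENT_SUBSTITUTE;
    $title = htmlspecialchars($item['title'], $flags, 'UTF-8');
    $description = htmlspecialchars($item['description'], $flags, 'UTF-8');

    return "<h2>{$title}</h2><p>{$description}</p>";
}

echo renderItem($stored);
```

Keeping the stored copy unescaped means the same data can later be escaped correctly for a different output context, such as an HTML attribute or a JavaScript string.
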
These are just basic examples and cannot cover every scenario that you, as a
developer, can and most likely will encounter. It is your responsibility to
learn which libraries and tools are at your disposal, and when and how to use
them to secure your web applications.