Recipe 13.14. Removing HTML and PHP Tags


13.14.1. Problem

You want to remove HTML and PHP tags from a string or file. For example, you want to make sure there is no HTML in a string before printing it or PHP in a string before passing it to eval( ).

13.14.2. Solution

Use strip_tags( ) to remove HTML and PHP tags from a string, as shown in Example 13-52.

Removing HTML and PHP tags

<?php
$html = '<a href="http://www.oreilly.com">I <b>love computer books.</b></a>
         <?php echo "Hello!" ?>';
print strip_tags($html);
?>

Example 13-52 prints:

I love computer books.

To strip tags from a stream as you read it, use the string.strip_tags stream filter, as shown in Example 13-53.

Removing HTML and PHP tags from a stream

<?php
$stream = fopen('elephant.html','r');
stream_filter_append($stream, 'string.strip_tags');
print stream_get_contents($stream);
?>

13.14.3. Discussion

Both strip_tags( ) and the string.strip_tags filter can be told not to remove certain tags. Provide a string containing the allowable tags to strip_tags( ) as a second argument. The tag specification is case insensitive, and for pairs of tags, you only have to specify the opening tag. For example, to remove all but <b> and <i> tags from $html, call strip_tags($html,'<b><i>').
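
As a small illustration of the allowable-tags argument, the sketch below keeps bold and italic markup while dropping the surrounding link. The sample string and the $clean variable are made up for this illustration; the uppercase <I> is there only to show that the tag specification is matched case insensitively.

```php
<?php
// Hypothetical sample markup: a link wrapping bold and italic text.
$html = '<a href="http://www.oreilly.com">I <b>love</b> <I>computer</I> books.</a>';

// Allow <b> and <i>; listing only the opening tags is enough,
// and '<i>' also matches the uppercase <I></I> pair.
$clean = strip_tags($html, '<b><i>');

print $clean;
```

The <a> tags are removed, while the <b></b> and <I></I> pairs are left in place exactly as they were written.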

With the string.strip_tags filter, pass a similar string as a fourth argument to stream_filter_append( ). The third argument to stream_filter_append( ) controls whether the filter is applied on reading (STREAM_FILTER_READ), writing (STREAM_FILTER_WRITE), or both (STREAM_FILTER_ALL). Example 13-54 does what Example 13-53 does, but allows <b> and <i> tags.

Removing some HTML and PHP tags from a stream

<?php
$stream = fopen('elephant.html','r');
stream_filter_append($stream, 'string.strip_tags',STREAM_FILTER_READ,'<b><i>');
print stream_get_contents($stream);
?>

stream_filter_append( ) also accepts an array of tag names instead of a string: array('b','i') instead of '<b><i>'.

Whether with strip_tags( ) or the stream filter, attributes are not removed from allowed tags. This means that an attribute that changes display (such as style) or executes JavaScript (any event handler) is preserved. If you are displaying "stripped" text of arbitrary origin in a web browser to a user, this could result in cross-site scripting attacks.
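
The following sketch shows the attribute problem in miniature. The input string and variable names are hypothetical, and the second half is only one possible mitigation for the case where no markup needs to survive at all; it is not a general HTML sanitizer.

```php
<?php
// Hypothetical untrusted input: an allowed tag carrying an event handler.
$input = '<b onmouseover="alert(1)">Hover over me</b>';

// <b> is allowed, so the whole tag, attributes included, passes through.
$stripped = strip_tags($input, '<b>');
print $stripped . "\n";

// One possible mitigation when no markup is needed:
// strip every tag, then escape what is left before printing it.
$safe = htmlspecialchars(strip_tags($input), ENT_QUOTES, 'UTF-8');
print $safe . "\n";
```

The first print still contains the onmouseover handler, which a browser would run. If some tags must be kept, a dedicated filtering library such as the one listed in See Also is a better fit than an allow list passed to strip_tags( ).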

13.14.4. See Also

Documentation on strip_tags( ) at http://www.php.net/strip-tags, on stream_filter_append( ) at http://www.php.net/stream_filter_append, and on stream filters at http://www.php.net/filters. The PEAR package HTML_Safe attempts to remove unsafe content from HTML and is available at http://pear.php.net/package/HTML_Safe. Recipe 18.4 has more details on cross-site scripting.




PHP Cookbook, 2nd Edition
PHP Cookbook: Solutions and Examples for PHP Programmers
ISBN: 0596101015
Year: 2006
Pages: 445