Recipe 13.13. Converting HTML to Plain Text


13.13.1. Problem

You need to convert HTML to readable, formatted plain text.

13.13.2. Solution

Use the html2text class available from http://www.chuggnutt.com/html2text.php. Example 13-51 shows it in action.

Example 13-51. Converting HTML to plain text

<?php
require_once 'class.html2text.inc';

// Fetch the HTML to convert
$html = file_get_contents('http://www.example.com/article.html');

// Convert it and retrieve the plain-text version
$converter = new html2text($html);
$plain_text = $converter->get_text();
?>

13.13.3. Discussion

The html2text class has many formatting rules built in, so the plain text it generates keeps some visual layout for headings, paragraphs, and so on. It also appends a list of all the links found in the HTML to the end of the generated text.
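To make the idea concrete, here is a minimal sketch of the kind of work such a converter does: applying a few formatting rules and collecting links into a list at the bottom. It is not the html2text class's code and does not reproduce its actual rules or output format. The function name sketch_html_to_text(), the "[n]" link markers, and the "Links:" heading are all assumptions made for illustration, and the regular expressions only handle simple, well-formed markup.

```php
<?php
// Illustrative sketch only: NOT the html2text class's implementation.
// sketch_html_to_text() is a hypothetical helper name.
function sketch_html_to_text(string $html): string
{
    $links = [];

    // Replace each <a href="...">label</a> with "label [n]" and remember the URL.
    $text = preg_replace_callback(
        '#<a\s[^>]*href="([^"]*)"[^>]*>(.*?)</a>#is',
        function (array $m) use (&$links): string {
            $links[] = $m[1];
            return $m[2] . ' [' . count($links) . ']';
        },
        $html
    );

    // Render headings in upper case, set off by blank lines.
    $text = preg_replace_callback(
        '#<h[1-6][^>]*>(.*?)</h[1-6]>#is',
        function (array $m): string {
            return "\n\n" . strtoupper(strip_tags($m[1])) . "\n\n";
        },
        $text
    );

    // Paragraph ends and line breaks become newlines.
    $text = preg_replace('#</p\s*>#i', "\n\n", $text);
    $text = preg_replace('#<br\s*/?>#i', "\n", $text);

    // Drop remaining tags, decode entities, collapse extra blank lines.
    $text = html_entity_decode(strip_tags($text), ENT_QUOTES, 'UTF-8');
    $text = trim(preg_replace("/\n{3,}/", "\n\n", $text));

    // Append the collected links at the bottom of the text.
    if ($links) {
        $text .= "\n\nLinks:";
        foreach ($links as $i => $url) {
            $text .= "\n[" . ($i + 1) . '] ' . $url;
        }
    }
    return $text;
}

$html = '<h1>News</h1><p>Read the <a href="http://www.example.com/">article</a>.</p>';
echo sketch_html_to_text($html), "\n";
```

Run on the sample input, the sketch prints the heading in upper case, the paragraph with a numbered marker after the link text, and a numbered list of URLs at the end. A real converter such as html2text covers far more elements and edge cases than these few rules.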

13.13.4. See Also

http://www.chuggnutt.com/html2text.php for more information on html2text and links to download it.




PHP Cookbook, 2nd Edition
PHP Cookbook: Solutions and Examples for PHP Programmers
ISBN: 0596101015
Year: 2006
Pages: 445
