A PHP package that converts HTML into plain text, with no HTML tags left in the output.
Overview
masroore/html2text is a PHP package that converts a page of HTML into clean, easy-to-read plain ASCII text.
Installation
Requires PHP 8.0+
Install the package via Composer:
composer require masroore/html2text
Usage
Extract text from HTML:
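use Kaiju\Html2Text\Html2Text;
# $html holds the HTML markup to convert (sample value shown).
$html = '<p>Hello, <b>world</b>!</p>';
$converter = new Html2Text();
echo $converter->convert($html);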
Callback functions
You can customize the conversion by providing callbacks for the pre-processing, tag-replacement, and post-processing stages:
# Pre-processing callback: rewrite <a href="URL">text</a> as "text (URL)".
$converter->setPreProcessingCallback(fn (string $s) => preg_replace('%<\s*a[^>]*href=[\'"](.*?)[\'"][^>]*>([\s\S]*?)<\/\s*a\s*>%i', '$2 ($1)', $s));
# Tag-replacement callback: turn each opening <li> tag into a "- " bullet on a new line.
$converter->setTagReplacementCallback(fn (string $s) => preg_replace('/<\s*li[^>]*>/i', "\n- ", $s));
# Post-processing callback: replace ... with your own callable.
$converter->setPostProcessingCallback(...);
# Convert the HTML with the callbacks applied.
echo $converter->convert($html);
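To see what the callback patterns above do on their own, the following stand-alone sketch applies them to small sample strings without calling the package. The variable names and sample inputs are illustrative, and $tidyBlankLines is a hypothetical post-processing callback, not something taken from the package.

```php
<?php
// Stand-alone sketch: these closures are plain PHP and do not call the package.

// Same link-rewriting pattern as the pre-processing callback above.
$rewriteLinks = fn (string $s): string => preg_replace(
    '%<\s*a[^>]*href=[\'"](.*?)[\'"][^>]*>([\s\S]*?)<\/\s*a\s*>%i',
    '$2 ($1)',
    $s
);

// Same <li> pattern as the tag-replacement callback above.
$bulletItems = fn (string $s): string => preg_replace('/<\s*li[^>]*>/i', "\n- ", $s);

// Hypothetical post-processing callback (an assumption, not from the package):
// collapse runs of three or more newlines into one blank line, then trim the ends.
$tidyBlankLines = fn (string $s): string => trim(preg_replace('/\n{3,}/', "\n\n", $s));

$linked   = $rewriteLinks('<a href="https://example.com">Example</a>');
$bulleted = $bulletItems('<ul><li>one</li><li>two</li></ul>');
$tidied   = $tidyBlankLines("First\n\n\n\nSecond\n");

echo $linked, PHP_EOL;               // Example (https://example.com)
echo json_encode($tidied), PHP_EOL;  // "First\n\nSecond"
echo $bulleted, PHP_EOL;             // closing </li> and <ul> tags are still present here
```

The tag-replacement pattern only rewrites opening <li> tags, so the remaining markup is left for the converter to strip. Assuming setPostProcessingCallback accepts a string-to-string callable like the other two setters, a closure such as $tidyBlankLines could be passed to it directly.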
Testing
composer test
Changelog
Please see CHANGELOG for more information on what has changed recently.
Contributing
Thank you for considering contributing to Html2Text. Please see the contribution guidelines for details.
Security Vulnerabilities
Please review our security policy on how to report security vulnerabilities.
Credits
License
Html2Text is open-source software licensed under the MIT license.