Bug with smart quotes processing when quoted text is directly followed by HTML
gchtr opened this issue · comments
First of all, thanks for this awesome library!
I have an issue with smart quotes processing when quoted text is directly followed by an HTML element with no space in between.
As soon as I put spaces on the left or the right of the HTML element, it starts working correctly.
Example
Input
"Text."<br>Text after.
Expected output
«Text.»<br>Text after.
Actual output
«Text.«<br>Text after.
Test case
Here’s a test case to visualize that. I tried writing a new test for this, but I didn’t find the right place to put it :).
<?php
use PHP_Typography\PHP_Typography;
use PHP_Typography\Settings;
require_once './vendor/autoload.php';
$settings = new Settings();
$settings->set_smart_quotes_primary( 'doubleGuillemets' );
$settings->set_smart_quotes_secondary( 'singleGuillemets' );
$typo = new PHP_Typography();
echo '<h2>Fail</h2>';
// These don’t work. The second " or ' is rendered as «/‹ instead of »/›.
var_dump( $typo->process( '"Text."<br>Text after.', $settings ) );
var_dump( $typo->process( '"Text."<strong>Text after.</strong>', $settings ) );
var_dump( $typo->process( "'Text.'<br>Text after.", $settings ) );
var_dump( $typo->process( "'Text.'<strong>Text after.</strong>", $settings ) );
echo '<h2>Success</h2>';
// A space before or after the HTML element does work.
var_dump( $typo->process( '"Text."<br> Text after.', $settings ) );
var_dump( $typo->process( '"Text."<strong> Text after.</strong>', $settings ) );
var_dump( $typo->process( '"Text." <br>Text after.', $settings ) );
var_dump( $typo->process( '"Text." <strong>Text after.</strong>', $settings ) );
var_dump( $typo->process( "'Text.'<br> Text after.", $settings ) );
var_dump( $typo->process( "'Text.'<strong> Text after.</strong>", $settings ) );
var_dump( $typo->process( "'Text.' <br>Text after.", $settings ) );
var_dump( $typo->process( "'Text.' <strong>Text after.</strong>", $settings ) );
Thank you for the report. That is happening because the "following" letter is taken into account (unless there are block-level elements in between). I'll have to check what the DOM tree actually looks like here, but <br> will probably need special handling.
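To make the explanation above concrete, here is a minimal sketch of how a "look at the following letter" rule can misfire across an inline element. It is a toy heuristic and not PHP-Typography's actual implementation: classify_quote() and next_visible_char() are hypothetical helpers, and the rule "a quote directly followed by a letter or digit is an opening quote" is an assumption made only for illustration.

```php
<?php
// Toy heuristic for illustration only. This is NOT how PHP-Typography
// is implemented; both helper functions below are hypothetical.

/**
 * Classify a straight quote as opening or closing, based solely on the
 * character that follows it in the visible text.
 */
function classify_quote( string $next ): string {
    // Assumed rule: a quote directly followed by a letter or digit opens a quotation.
    if ( '' !== $next && preg_match( '/^[\p{L}\p{N}]$/u', $next ) ) {
        return 'open';
    }
    return 'close';
}

/**
 * Return the first visible character after position $pos, skipping any
 * tags that start immediately after it (roughly what happens when the
 * next text node is consulted instead of the raw markup).
 */
function next_visible_char( string $html, int $pos ): string {
    $rest = (string) substr( $html, $pos + 1 );
    // Drop tags that directly follow the quote; a leading space stops the match.
    $rest = (string) preg_replace( '/^(?:<[^>]*>)+/', '', $rest );
    // Grab the first UTF-8 character without relying on the mbstring extension.
    return preg_match( '/^./us', $rest, $m ) ? $m[0] : '';
}

$samples = [
    '"Text."<br>Text after.',  // no space: the next visible character is "T"
    '"Text."<br> Text after.', // space after the element
    '"Text." <br>Text after.', // space before the element
];

foreach ( $samples as $html ) {
    $pos = strrpos( $html, '"' ); // position of the second quote
    echo $html, ' => ', classify_quote( next_visible_char( $html, $pos ) ), PHP_EOL;
}
```

With this rule, only the first sample classifies the second quote as "open", which mirrors the failing cases in the report; a space on either side of the element yields "close". Treating <br> as a boundary, the way block-level elements are, would be one way to avoid the misclassification.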