alx-tools / Betty

Holberton-style C code checker written in Perl

Unicode characters count as 2 characters.

0100-0100 opened this issue

When a line contains Unicode characters, each one is counted as 2 characters
instead of 1.

After researching Unicode support in Perl, I believe this happens because Perl
counts a Unicode character as 2 bytes, depending on the character's size in hex
digits.
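
To illustrate the difference between counting bytes and counting characters,
here is a minimal sketch in C. It is not taken from Betty; the function names
utf8_codepoints and utf8_extra_bytes are hypothetical, and the code assumes the
input is valid UTF-8. For reference, the box-drawing character used in the
sample below (U+2500) occupies three bytes in UTF-8, so the exact overcount a
checker reports may depend on how it reads and decodes its input.

```c
#include <stddef.h>
#include <string.h>

/*
 * utf8_codepoints - count code points in a NUL-terminated UTF-8 string
 * (hypothetical helper, assumes valid UTF-8 input)
 *
 * Bytes of the form 10xxxxxx are continuation bytes, so only the
 * bytes that do NOT match that pattern start a new character.
 */
size_t utf8_codepoints(const char *s)
{
	size_t count = 0;

	for (; *s != '\0'; s++)
	{
		if (((unsigned char)*s & 0xC0) != 0x80)
			count++;
	}
	return (count);
}

/*
 * utf8_extra_bytes - how many bytes a byte-based length overcounts by
 * (hypothetical helper, assumes valid UTF-8 input)
 */
size_t utf8_extra_bytes(const char *s)
{
	return (strlen(s) - utf8_codepoints(s));
}
```

For a pure ASCII line both counts agree, so utf8_extra_bytes returns 0. A
length check built on the byte count would flag a line as too long even when
the number of visible characters is within the limit.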

─────────────── How to reproduce the problem ───────────────

Below is a sample C comment exactly 81 characters long, followed by a comment of
the same length that contains one Unicode character.

ASCII comment:
/* Line # 14 A long commentary of exactly 81 characters. */
Unicode comment:
/* Line # 16, with a Unicode character ----> ─ <---- */

──────────────────── Sample image ────────────────────

Here's an attached image showing the output given by betty.

image

──────────────────── Where to look ────────────────────

From reading the Perl code, I believe the problem arises around line 2813, in
the conditional statement that raises the warning message, indicated here:
https://github.com/holbertonschool/Betty/blob/438f97cb63fa6ee8d6a8092a4f2fb529e238d1c9/betty-style.pl#L2813