unee-t / frontend

Meteor front end

Home Page: https://case.dev.unee-t.com/

Chinese characters are not displayed in the email notifications

franck-boullier opened this issue · comments

If the user types a Chinese character in a Unee-T case, the email notification does not display it correctly.

Any chance the email could be forwarded to me as an attachment, please?

Btw, this is in the demo environment AFAICT.

The case title is literally ????? in RequestIDs "17201dcb-6172-4946-a37b-07dfd17c5c3f" & "8e2a2047-7045-4b4c-887d-2d7f5a3cdaf5". I.e. the payload generated has the Unicode mangled.

aws --profile uneet-demo logs filter-log-events --log-group-name "/aws/lambda/alambda_simple" --start-time 1561967459000 --filter-pattern '7206'

Sorry, there was a reply on Jul 19th:


To provide some background on this issue: as you might already be aware, MySQL’s utf8 (the default) implements UTF-8 only partially. It covers the 65,536 code points from U+0000 to U+FFFF, known as the Basic Multilingual Plane (BMP), because it stores at most three bytes per multi-byte character. UTF-8-encoded symbols that take up four bytes are not supported.
Thus, when an attempt is made to insert a string containing characters that the column's character set cannot store, an 'Incorrect string value' warning is raised (in the case of MySQL 5.6 compatible instances) and the string value gets 'mangled'.
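A quick way to see which strings fall inside the three-byte limit described above is to measure their UTF-8 length per character. The sketch below is illustrative only: the helper names `utf8_len`, `fits_in_utf8mb3` and `simulate_lossy_store` are made up for this example, and the last one merely imitates a lossy conversion with Python's `errors="replace"` handler; it is not MySQL's actual conversion code.

```python
def utf8_len(ch: str) -> int:
    """Return how many bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_utf8mb3(text: str) -> bool:
    """True if every character lies in the BMP (U+0000..U+FFFF),
    i.e. needs at most three bytes in UTF-8."""
    return all(ord(ch) <= 0xFFFF for ch in text)


def simulate_lossy_store(text: str, charset: str = "latin-1") -> str:
    """Imitate storing text in a narrower charset (assumption: latin-1).

    Characters the charset cannot represent are replaced with '?',
    which is roughly the kind of mangling seen in the thread.
    """
    return text.encode(charset, errors="replace").decode(charset)


title = "但你的愛是"
byte_sizes = [utf8_len(ch) for ch in title]  # one entry per character
```

By this check the sample title needs three bytes per character, which agrees with the `\xE4\xBD\x86` bytes in the warning shown in the test session, while an emoji such as "😀" needs four. `simulate_lossy_store(title)` yields five question marks, one per character that the narrower charset could not hold.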

An example of this in my test environment, using just a plain table:
mysql> insert into t1 values(2, "但你的愛是");
Query OK, 1 row affected, 1 warning (0.21 sec)

mysql> show warnings;
+---------+------+----------------------------------------------------------------------------------+
| Level   | Code | Message                                                                          |
+---------+------+----------------------------------------------------------------------------------+
| Warning | 1366 | Incorrect string value: '\xE4\xBD\x86\xE4\xBD\xA0...' for column 'name' at row 1 |
+---------+------+----------------------------------------------------------------------------------+
1 row in set (0.25 sec)

mysql> select * from t1;
+------+-------+
| id   | name  |
+------+-------+
|    1 | dvfdv |
|    2 | ????? |
+------+-------+
2 rows in set (0.23 sec)

mysql> ALTER TABLE t1
    -> DEFAULT CHARACTER SET utf8mb4,
    -> MODIFY name VARCHAR(100)
    -> CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
Query OK, 2 rows affected (0.30 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> insert into t1 values(2, "但你的愛是");
Query OK, 1 row affected (0.29 sec)

mysql> select * from t1;
+------+-----------------+
| id   | name            |
+------+-----------------+
|    1 | dvfdv           |
|    2 | ?????           |
|    2 | 但你的愛是      |
+------+-----------------+
3 rows in set (0.22 sec)

Hence, as you can see above, we can get past this issue by using utf8mb4 instead. You would need to change the 'name' column in this case to use the utf8mb4 character set and collation. Below is an example query to convert the character set and collation:
mysql> ALTER TABLE table_name DEFAULT CHARACTER SET utf8mb4, MODIFY column_name VARCHAR(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
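If several columns need the same conversion, the statement above could be generated rather than typed by hand. This is a minimal sketch under stated assumptions: `quote_ident` and `build_utf8mb4_alter` are hypothetical helpers, not part of Unee-T or Bugzilla, and the column type and collation are assumed to come from a trusted source because they are inserted into the SQL text unescaped.

```python
def quote_ident(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_utf8mb4_alter(table: str, column: str, column_type: str,
                        collation: str = "utf8mb4_unicode_ci") -> str:
    """Build an ALTER TABLE statement of the same shape as the query above.

    column_type is the column definition to restate (for example
    "VARCHAR(100)"); it is trusted input and is not escaped here.
    """
    return (
        f"ALTER TABLE {quote_ident(table)} "
        f"DEFAULT CHARACTER SET utf8mb4, "
        f"MODIFY {quote_ident(column)} {column_type} "
        f"CHARACTER SET utf8mb4 COLLATE {collation};"
    )


statement = build_utf8mb4_alter("t1", "name", "VARCHAR(100)")
```

Because `MODIFY` restates the whole column definition, any attributes the column already has (such as `NOT NULL` or a default) would need to be included in `column_type`, otherwise they are dropped.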

I think it's related to #850 (comment) and is pending on bugzilla/bugzilla#79