Panic due to byte access between char boundary

Question

MakotoE opened this issue 3 years ago

When you run this unit test, it panics with the message "byte index 3 is not a char boundary".

    #[test]
    fn test() {
        let s = "\u{1b}!Ͽ";
        fill(s, 10);
    }
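For context on the panic message, here is a minimal standalone sketch of why byte index 3 is invalid for this particular string. It uses only the standard library; `char_boundaries` is a hypothetical helper written for illustration and is not part of the crate under discussion.

```rust
/// Hypothetical helper: byte offsets at which `s` may be sliced without panicking.
fn char_boundaries(s: &str) -> Vec<usize> {
    (0..=s.len()).filter(|&i| s.is_char_boundary(i)).collect()
}

fn main() {
    let s = "\u{1b}!Ͽ";
    // ESC and '!' take one byte each in UTF-8; 'Ͽ' (U+03FF) takes two.
    assert_eq!(s.len(), 4);
    assert_eq!(char_boundaries(s), vec![0, 1, 2, 4]);
    // Byte 3 falls inside 'Ͽ', so `&s[..3]` would panic.
    assert!(!s.is_char_boundary(3));
    // `str::get` is the non-panicking counterpart and returns `None` here.
    assert_eq!(s.get(..3), None);
}
```

This only shows that offset 3 is not sliceable in the input string; it does not show where the library obtains that offset.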

Makoto · Answer 1 · Sat Jun 26 2021 01:22:51 GMT+0800 (China Standard Time)

I'm just guessing here, but was this piece of code supposed to be like:

diff --git a/src/word_separators.rs b/src/word_separators.rs
index 81ec6d4..abac756 100644
--- a/src/word_separators.rs
+++ b/src/word_separators.rs
@@ -216,7 +216,7 @@ impl WordSeparator for UnicodeBreakProperties {
         let mut opportunities = unicode_linebreak::linebreaks(&stripped)
             .filter(|(idx, _)| {
                 #[allow(clippy::match_like_matches_macro)]
-                match &line[..*idx].chars().next_back() {
+                match &stripped[..*idx].chars().next_back() {
                     // We suppress breaks at ‘-’ since we want to control
                     // this via the WordSplitter.
                     Some('-') => false,

Martin Geisler · Answer 2 · Sat Jun 26 2021 02:10:13 GMT+0800 (China Standard Time)

Hi @MakotoE, thank you very much for the report! I think you're right — it seems very suspicious that we iterate over stripped but slice line.

I'm not at a computer right now, but I assume your test no longer fails if you swap the variables? If so, could you open a PR? Thanks!
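The kind of mismatch described above (an index computed on one string, then used to slice another) can be sketched in isolation. This is an illustration under stated assumptions, not the library's code: `strip` is a hypothetical stand-in that simply drops every '!' so that the stripped copy is shorter than the original, and `last_char_before` is a hypothetical helper that uses `str::get` so the invalid case shows up as `None` instead of a panic.

```rust
/// Hypothetical stand-in for a stripping step: drops every '!'.
/// (Assumption for illustration; the real stripping logic may differ.)
fn strip(line: &str) -> String {
    line.chars().filter(|&c| c != '!').collect()
}

/// Last character before byte offset `idx`, or `None` if `idx` is
/// out of range or not on a char boundary of `text`.
fn last_char_before(text: &str, idx: usize) -> Option<char> {
    text.get(..idx)?.chars().next_back()
}

fn main() {
    let line = "\u{1b}!Ͽ"; // 4 bytes
    let stripped = strip(line); // "\u{1b}Ͽ", 3 bytes

    // An offset that is valid for `stripped` (here, its end).
    let idx = stripped.len();
    assert_eq!(idx, 3);

    // Slicing the string the offset was computed on works.
    assert_eq!(last_char_before(&stripped, idx), Some('Ͽ'));

    // The same offset lands in the middle of 'Ͽ' in `line`;
    // `&line[..idx]` would panic with "byte index 3 is not a char boundary".
    assert_eq!(last_char_before(line, idx), None);
}
```

Whenever two strings differ in length, byte offsets from one are not guaranteed to be valid in the other, which is consistent with slicing `stripped` instead of `line` as the diff proposes.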