messense / jieba-rs

The Jieba Chinese Word Segmentation Implemented in Rust

load_dict stops on a user_dict parse error, and no error information is printed

aohan237 opened this issue

At line 288 of lib.rs:

map_err still behaves like unwrap, and you can't get any information about what is wrong.

Maybe change this code:

-                let freq = parts
-                    .get(1)
-                    .map(|x| {
-                        x.parse::<usize>()
-                            .map_err(|e| Error::InvalidDictEntry(format!("{}", e)))
-                    })
-                    .unwrap_or(Ok(0))?;

to something like this, so users can see what went wrong. You can then decide whether to continue or stop reading.

+                let freq = {
+                    if let Some(m_freq) = parts.get(1) {
+                        if let Ok(mm_freq) = m_freq.parse::<usize>() {
+                            mm_freq
+                        } else {
+                            println!("parse error {:?}", &parts);
+                            0
+                        }
+                    } else {
+                        println!("get nothing {:?}", &parts);
+                        0
+                    }
+                };

I don't think the printlns are needed; instead, we should improve Error::InvalidDictEntry so that it includes more information about the parsing error.
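As a rough illustration of that idea, here is a minimal sketch of an error that carries the line number, the raw line, and the underlying parse failure. `DictError`, its fields, and the `parse_freq` helper are hypothetical names made up for this example, not the actual jieba-rs types, and splitting on whitespace is an assumption about the dictionary format.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical error type for illustration; not the real jieba-rs `Error` enum.
#[derive(Debug)]
enum DictError {
    InvalidDictEntry {
        line_no: usize,
        line: String,
        source: ParseIntError,
    },
}

impl fmt::Display for DictError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DictError::InvalidDictEntry { line_no, line, source } => write!(
                f,
                "line {}: invalid frequency in `{}`: {}",
                line_no, line, source
            ),
        }
    }
}

impl std::error::Error for DictError {}

/// Parses the optional frequency column (second field) of one dictionary line.
/// A missing frequency defaults to 0, as in the snippet quoted above.
/// Assumption: fields are separated by whitespace.
fn parse_freq(line_no: usize, line: &str) -> Result<usize, DictError> {
    let parts: Vec<&str> = line.split_whitespace().collect();
    match parts.get(1) {
        Some(raw) => raw.parse::<usize>().map_err(|e| DictError::InvalidDictEntry {
            line_no,
            line: line.to_string(),
            source: e,
        }),
        None => Ok(0),
    }
}
```

With something along these lines, the caller can still short-circuit with `?`, but the returned error says which line failed and why.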

Yes, more information is needed. I just made a demo to show the information.
Any better implementation is welcome.

map_err still causes a panic wherever there is an error.
But if you have a large user_dict, maybe something around 100M, I would want all the error lines at once, rather than retrying every time I correct one error line.

I would prefer to keep the error logic short-circuited as it is right now, but improve the user experience by providing a separate dictionary validation tool. We could redirect the user to that validation tool in the error message. I agree that the error message needs to be improved.
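A separate validation pass of that kind could look roughly like the sketch below: it scans the whole dictionary and collects every line whose frequency column fails to parse, instead of stopping at the first one. `BadLine` and `validate_dict` are hypothetical names for this example only; no such tool is confirmed to exist in jieba-rs, and whitespace-separated fields are an assumption.

```rust
use std::io::{self, BufRead};

/// One problem found in a dictionary file (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
struct BadLine {
    line_no: usize,
    line: String,
    reason: String,
}

/// Reads every line and records all entries whose frequency column
/// (second field) does not parse as `usize`. Loading is not attempted;
/// this only reports problems so they can be fixed in one go.
fn validate_dict<R: BufRead>(reader: R) -> io::Result<Vec<BadLine>> {
    let mut bad = Vec::new();
    for (idx, line) in reader.lines().enumerate() {
        let line = line?;
        let mut parts = line.split_whitespace();
        // Skip blank lines: there is no word, so nothing to check.
        if parts.next().is_none() {
            continue;
        }
        // A missing frequency is allowed (it defaults to 0 when loading).
        if let Some(raw) = parts.next() {
            if let Err(e) = raw.parse::<usize>() {
                bad.push(BadLine {
                    line_no: idx + 1, // report 1-based line numbers
                    line: line.clone(),
                    reason: e.to_string(),
                });
            }
        }
    }
    Ok(bad)
}
```

Any `BufRead` works as input, for example `std::io::BufReader::new(std::fs::File::open(path)?)`, so a 100M dictionary is streamed line by line rather than loaded into memory at once.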