LUS midterm project
Midterm project for the Language Understanding System course 2019/2020 @ UniTN.
Concept tagging for the movie domain. The models were trained and performed using the following dataset: NL2SparQL4NLU.
Results - SCLM
Baseline | Remove of OoS | Use of POS tags | Normalization | F1-score | Accuracy | Precision | Recall |
---|---|---|---|---|---|---|---|
Output symbol priors | 0.0036 | 0.0255 | 0.0036 | 0.0036 | |||
Random path | 0.0268 | 0.4982 | 0.0218 | 0.0348 | |||
None | 0.7269 | 0.9165 | 0.7600 | 0.6966 | |||
MLE | 0.7223 | 0.9127 | 0.7197 | 0.7250 | |||
MLE | ☑️ | 0.7466 | 0.9055 | 0.6883 | 0.8157 | ||
MLE | ☑️ | ☑️ | 0.8119 | 0.9421 | 0.8046 | 0.8194 | |
MLE | ☑️ | ☑️ | ☑️ | 0.8021 | 0.9380 | 0.7960 | 0.8084 |
Results - HMM
Estimator | Remove of OoS | Use of POS tags | Normalization | F1-score | Accuracy | Precision | Recall |
---|---|---|---|---|---|---|---|
Laplace | 0.6326 | 0.8871 | 0.7237 | 0.5618 | |||
ELE | 0.6948 | 0.9050 | 0.7406 | 0.6544 | |||
Lindstone 0.05 | 0.7078 | 0.9074 | 0.7062 | 0.7094 | |||
Lindstone 0.15 | 0.7093 | 0.9099 | 0.7129 | 0.7057 | |||
Witten Bell | 0.7159 | 0.9124 | 0.7512 | 0.6837 | |||
MLE | 0.7173 | 0.9086 | 0.7719 | 0.6700 | |||
Lindstone 0.1 | 0.7213 | 0.9130 | 0.7213 | 0.7213 | |||
Witten Bell + Witten Bell | ☑️ | 0.7862 | 0.9326 | 0.7821 | 0.7873 | ||
Witten Bell + Witten Bell + Witten Bell | ☑️ | ☑️ | 0.7576 | 0.9277 | 0.7618 | 0.7534 | |
Witten Bell + Witten Bell + Witten Bell + Witten Bell | ☑️ | ☑️ | ☑️ | 0.7311 | 0.9178 | 0.7318 | 0.7305 |