phejimlin / Natural-Language-Processing

A course provided by NTHU ISA in 2017 spring.

Natural-Language-Processing

Lab1_Spell Checker

Goal:

  • Fusion errors (e.g. “taketo” → “take to”)
  • Multi-token errors (e.g. “mor efun” → “more fun”)
  • Fission errors (e.g. “with out” → “without”)

Reference : http://norvig.com/spell-correct.html
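The three word-boundary cases above (fused words, a misplaced space, and a wrongly split word) can be sketched roughly as follows. This is only an illustration, not the lab's actual code: the tiny `WORDS` set and the helper names `split_in_two` and `correct` are assumptions, and a real checker in the style of the Norvig reference would rank candidates by corpus probability instead of accepting the first match.

```python
# Toy vocabulary (assumption); a real checker would use corpus word counts.
WORDS = {"take", "to", "more", "fun", "with", "out", "without", "we", "home"}


def split_in_two(text, words=WORDS):
    """Return [left, right] if text splits into two known words, else None."""
    for i in range(1, len(text)):
        left, right = text[:i], text[i:]
        if left in words and right in words:
            return [left, right]
    return None


def correct(sentence, words=WORDS):
    """Repair word-boundary errors in a whitespace-tokenized sentence."""
    tokens = sentence.split()
    out, i = [], 0
    while i < len(tokens):
        tok = tokens[i]
        if i + 1 < len(tokens):
            nxt = tokens[i + 1]
            joined = tok + nxt
            # Wrongly split word: the two tokens join into a known word.
            if joined in words:
                out.append(joined)
                i += 2
                continue
            # Multi-token error: the space is misplaced, so re-split the pair.
            if tok not in words or nxt not in words:
                fixed = split_in_two(joined, words)
                if fixed:
                    out.extend(fixed)
                    i += 2
                    continue
        # Fusion error: one unknown token hides two known words.
        if tok not in words:
            fixed = split_in_two(tok, words)
            if fixed:
                out.extend(fixed)
                i += 1
                continue
        out.append(tok)
        i += 1
    return " ".join(out)


print(correct("we taketo home"))  # we take to home
print(correct("mor efun"))        # more fun
print(correct("with out"))        # without
```

Note that this sketch always merges two tokens when the joined form is a known word, which is too aggressive for real text ("with out" can be legitimate); comparing the probabilities of both readings would be the safer design.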

Goal is to correct spelling errors within a sentence, using the surrounding words as context.

Example:

  • make a depe hole → make a deep hole
  • an ansion method of hunting → an ancient method of hunting
  • at brake time → at break time
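One way to approach examples like these is to generate candidates one edit away from each token and pick the one that fits its neighbours best. The sketch below is a hedged illustration, not the lab's implementation: the `UNIGRAMS` and `BIGRAMS` counts are made-up toy values, `context_score` is a hypothetical scoring function, and only single-edit candidates are considered (so a case like "ansion" → "ancient" would need a wider candidate search).

```python
import string

# Toy counts standing in for corpus statistics (assumed values).
UNIGRAMS = {"make": 50, "a": 500, "deep": 20, "hole": 10, "dope": 5}
BIGRAMS = {("a", "deep"): 8, ("deep", "hole"): 6, ("a", "dope"): 1}


def edits1(word):
    """All strings one delete, transpose, replace, or insert away from word."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [L + R[1:] for L, R in splits if R]
    transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
    replaces = [L + c + R[1:] for L, R in splits if R for c in letters]
    inserts = [L + c + R for L, R in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def candidates(word):
    """Known words within one edit, plus the word itself if it is known."""
    known = {w for w in edits1(word) if w in UNIGRAMS}
    if word in UNIGRAMS:
        known.add(word)
    return known or {word}


def context_score(prev, cand, nxt):
    """Bigram counts with both neighbours; unigram count breaks ties."""
    return (BIGRAMS.get((prev, cand), 0)
            + BIGRAMS.get((cand, nxt), 0)
            + 1e-3 * UNIGRAMS.get(cand, 0))


def correct_sentence(sentence):
    tokens = sentence.lower().split()
    out = []
    for i, tok in enumerate(tokens):
        prev = out[-1] if out else "<s>"
        nxt = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
        best = max(candidates(tok), key=lambda c: context_score(prev, c, nxt))
        out.append(best)
    return " ".join(out)


print(correct_sentence("make a depe hole"))  # make a deep hole
```

Because known words also get candidates, the same loop can in principle catch real-word errors, at the cost of false alarms on correct sentences.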

Result:

  • Test case 1 corrects sentences that contain errors.

  • hits: 198 corrections: 90 error: 108

  • Test case 2 runs the corrector on already-correct sentences to measure false alarms.

  • hits: 198 corrections: 116 error: 82

Goal is to find collocations by using skip bigrams and Smadja's algorithm.
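As a rough illustration of the idea, the sketch below counts skip bigrams around a base word and then applies the three filters usually attributed to Smadja's method: strength (a z-score over collocate frequencies), spread (variance of the counts across relative positions), and a per-position peak test. The function names, the window of 5, and the thresholds `k0=1`, `k1=1`, `u0=10` are assumptions for this sketch, not values taken from the lab code.

```python
from collections import Counter, defaultdict
from math import sqrt


def skip_bigrams(tokens, max_dist=5):
    """Yield (w1, w2, distance) for every ordered pair within max_dist."""
    for i, w in enumerate(tokens):
        for d in range(1, max_dist + 1):
            if i + d < len(tokens):
                yield w, tokens[i + d], d


def smadja(sentences, base, max_dist=5, k0=1.0, k1=1.0, u0=10.0):
    """Return (collocate, distance) pairs for base that pass all filters."""
    # dist_counts[collocate][d]: how often collocate occurs d words from base
    # (negative d means the collocate precedes the base word).
    dist_counts = defaultdict(Counter)
    for tokens in sentences:
        for w1, w2, d in skip_bigrams(tokens, max_dist):
            if w1 == base:
                dist_counts[w2][d] += 1
            if w2 == base:
                dist_counts[w1][-d] += 1
    if not dist_counts:
        return []

    freqs = {c: sum(cnt.values()) for c, cnt in dist_counts.items()}
    mean_f = sum(freqs.values()) / len(freqs)
    sigma = sqrt(sum((f - mean_f) ** 2 for f in freqs.values()) / len(freqs))
    positions = [d for d in range(-max_dist, max_dist + 1) if d != 0]

    results = []
    for colloc, cnt in dist_counts.items():
        strength = (freqs[colloc] - mean_f) / sigma if sigma else 0.0
        mean_p = freqs[colloc] / len(positions)
        spread = sum((cnt[d] - mean_p) ** 2 for d in positions) / len(positions)
        if strength < k0 or spread < u0:
            continue  # too rare, or spread too evenly over positions
        threshold = mean_p + k1 * sqrt(spread)
        results.extend((colloc, d) for d in positions if cnt[d] >= threshold)
    return results


# Toy corpus: "make a decision" is frequent, the other pairs occur once each.
corpus = [["make", "a", "decision"]] * 40
corpus += [["make", f"w{i}"] for i in range(10)]
print(sorted(smadja(corpus, "make")))
```

On this toy corpus only "a" at distance 1 and "decision" at distance 2 survive; the one-off words fail the strength filter.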

Code here.

Goal is to run collocation extraction as a local map-reduce job using lmr.
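The source does not show how lmr is invoked, so the sketch below does not call it. It only mimics the map, sort, and reduce flow in a single Python process to show how skip-bigram counting splits into a mapper and a reducer; `mapper`, `reducer`, and `run_local` are hypothetical names, and the tab-separated key format is an assumption.

```python
from itertools import groupby


def mapper(line, max_dist=5):
    """Emit ("w1<TAB>w2<TAB>distance", 1) for every skip bigram in a line."""
    tokens = line.split()
    for i, w in enumerate(tokens):
        for d in range(1, max_dist + 1):
            if i + d < len(tokens):
                yield f"{w}\t{tokens[i + d]}\t{d}", 1


def reducer(key, values):
    """Sum the counts emitted for one key."""
    return key, sum(values)


def run_local(lines):
    """Map every line, sort by key (the shuffle step), then reduce per key."""
    pairs = sorted(kv for line in lines for kv in mapper(line))
    return [
        reducer(key, (v for _, v in group))
        for key, group in groupby(pairs, key=lambda kv: kv[0])
    ]


counts = dict(run_local(["make a decision", "make a plan"]))
print(counts["make\ta\t1"])  # 2
```

In a real map-reduce run the mapper and reducer would typically be separate scripts reading standard input, with the framework handling the sort between them.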


Languages

Python 44.2%, OpenEdge ABL 40.8%, Jupyter Notebook 12.7%, HTML 2.3%, Shell 0.0%