Martomate / LocalizeKhan

An improvement to the translation of Khan Academy content from English to Swedish

Home Page: https://localizekhan.herokuapp.com/

LocalizeKhan

Our solution speeds up proofreading by automatically classifying translated texts according to the quality of the translation. The aim is to minimize the tedious manual work of proofreading texts.

The Bayes classifier is built with the scikit-learn library and the Natural Language Toolkit (NLTK). The API is documented with Swagger; see the demo below.
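To make the idea concrete, here is a minimal sketch of a naive Bayes text classifier in scikit-learn. It is not the code in khan_clf.py: the bag-of-words features, the "good"/"bad" labels, and the four toy sentences are all assumptions made up for illustration, and the project's real features (including whatever it uses NLTK for) may differ.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy, made-up examples: translated strings with a hypothetical quality label.
texts = [
    "Lös ekvationen för x.",
    "Vad är arean av triangeln?",
    "Lösa den ekvation för x.",
    "Vad är area av den triangel?",
]
labels = ["good", "good", "bad", "bad"]

# Word counts fed into a multinomial naive Bayes model.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

# Classify a new translation and inspect the class probabilities.
candidate = ["Beräkna arean av triangeln."]
prediction = model.predict(candidate)[0]
probabilities = dict(zip(model.classes_, model.predict_proba(candidate)[0]))
print(prediction, probabilities)
```

With only a handful of examples the probabilities are not meaningful; the sketch only shows the shape of the fit/predict workflow.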

Please note: The classifier has so far been trained on only a small amount of data.

Installation

pip install google-cloud-translate Flask flasgger scikit-learn # See requirements.txt for all dependencies.
export FLASK_APP=app.py

To use Google's automatic translation service you need to set up authentication; see https://cloud.google.com/docs/authentication/getting-started
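A quick way to catch a missing credential before starting the server is a small check like the one below. It assumes authentication is done with a service-account key file referenced by the GOOGLE_APPLICATION_CREDENTIALS environment variable, which is one of the methods Google's client libraries support; the helper name credentials_problem is hypothetical and not part of this repository.

```python
import os
from pathlib import Path


def credentials_problem(env=os.environ):
    """Return a description of the problem, or None if the key file looks usable.

    Assumes a service-account key file referenced by
    GOOGLE_APPLICATION_CREDENTIALS; other authentication methods are not checked.
    """
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not Path(path).is_file():
        return f"credentials file not found: {path}"
    return None
```

Calling credentials_problem() with no arguments inspects the current environment; passing a dict makes the check easy to test.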

Usage

First, create a dataset, store it in training_data.txt, and then train the classifier:

python khan_clf.py
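The layout of training_data.txt is not described here, so check khan_clf.py for the format it actually expects. Purely as an illustration, the sketch below assumes a hypothetical layout of one example per line, with a label and the translated text separated by a tab; the function name parse_training_lines is likewise made up.

```python
def parse_training_lines(lines):
    """Split 'label<TAB>text' lines into parallel lists of texts and labels.

    The tab-separated layout is an assumption for illustration only;
    blank lines are skipped and malformed lines raise ValueError.
    """
    texts, labels = [], []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            continue
        label, separator, text = line.partition("\t")
        if not separator or not text:
            raise ValueError(f"line {number}: expected 'label<TAB>text'")
        labels.append(label.strip())
        texts.append(text.strip())
    return texts, labels
```

A file can be passed directly, for example parse_training_lines(open("training_data.txt", encoding="utf-8")), since a file object iterates over its lines.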

Run:

flask run

Visit http://localhost:5000/apidocs/ for details on how to use the API.
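The same information can be read programmatically from the Swagger specification that backs the docs page. The sketch below assumes the app uses flasgger's default configuration, which serves the spec at /apispec_1.json; if the project overrides that route, adjust spec_path. The function names are hypothetical, and no endpoint names are assumed: the code simply lists whatever the spec declares.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def fetch_spec(base_url, spec_path="/apispec_1.json"):
    """Download the Swagger spec (spec_path is flasgger's default route)."""
    with urlopen(base_url.rstrip("/") + spec_path, timeout=10) as response:
        return json.load(response)


def list_operations(spec):
    """Return (METHOD, path, summary) tuples for every operation in the spec."""
    operations = []
    for path, item in sorted(spec.get("paths", {}).items()):
        for method, operation in item.items():
            # Skip non-operation keys such as a shared "parameters" list.
            if method.lower() in HTTP_METHODS:
                operations.append((method.upper(), path, operation.get("summary", "")))
    return operations
```

With the server running, list_operations(fetch_spec("http://localhost:5000")) returns the available endpoints and their summaries.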

Online demo of the API docs

Visit http://localizekhan.herokuapp.com/apidocs/ to try out the API.

License

The MIT License (MIT) Copyright (c) 2017
