So Miyagawa (somiyagawa)

Company: University of Tsukuba

Location: Tsukuba, Japan

Home Page: https://somiyagawa.com/

Twitter: @So_Miyagawa


Organizations
CopticScriptorium
KELLIA
NINJAL-CPCR
UniversalDependencies

So Miyagawa's repositories

Language: CSS | License: MIT | Stargazers: 0 | Issues: 0
Stargazers: 0 | Issues: 0

coptic-xml-tool

Coptic SCRIPTORIUM XML editor tool

Language: JavaScript | Stargazers: 0 | Issues: 0

SINUHE

Sublime INput method of Unicode for Hieroglyphic Egyptian

License: MIT | Stargazers: 4 | Issues: 0

treebank_data

Perseus Treebank Data

Language: HTML | Stargazers: 0 | Issues: 0

SunoikisisDC-2016

Planning Seminar and SS 2016 Course

Stargazers: 0 | Issues: 0

proiel

A library for working with PROIEL treebanks

Language: Ruby | License: MIT | Stargazers: 1 | Issues: 0

KR6a0005

佛般泥洹經 (Fo bannihuan jing), Western Jin, Bai Fazu (白法祖)

Stargazers: 0 | Issues: 0

mozc

Mozc - a Japanese Input Method Editor designed for multiple platforms

Language: C++ | License: BSD-3-Clause | Stargazers: 0 | Issues: 0

deipnosophistae-reuses

Citable analyses of quotations and text reuses in the Deipnosophistae

Stargazers: 1 | Issues: 0

homeric-reuse

Citable analyses of Homeric text reuse in the Deipnosophistae

Stargazers: 1 | Issues: 0

canonical-greekLit

XML Canonical resources for Greek Literature

Language: XQuery | Stargazers: 0 | Issues: 0

isri-ocr-evaluation-tools

Automatically exported from code.google.com/p/isri-ocr-evaluation-tools

Language: C | Stargazers: 0 | Issues: 0

tesseract

Tesseract Open Source OCR Engine (main repository)

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 0

ANNIS

ANNIS is an open source, versatile web browser-based search and visualization architecture for complex multilevel linguistic corpora with diverse types of annotation.

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 0

ocropy

Python-based tools for document analysis and OCR

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0
Stargazers: 0 | Issues: 0

PoCoTo

Home of the Postcorrection Tool

Language: Java | License: NOASSERTION | Stargazers: 0 | Issues: 0

canonical

This will be the base repo for all text and annotation data published in the PDL

Stargazers: 0 | Issues: 0
Stargazers: 1 | Issues: 0

normalizer

Normalizes orthography

Stargazers: 0 | Issues: 0

lexical-taggers

Lexical taggers (language of origin, lemmatizer) for Sahidic Coptic

Stargazers: 0 | Issues: 0

keyboard

JavaScript keyboard

Stargazers: 0 | Issues: 0

corpora-legacy-releases

Corpora-Legacy-Releases

Stargazers: 0 | Issues: 0

tokenizers

Coptic SCRIPTORIUM Tokenization Script

Stargazers: 0 | Issues: 0

converter-complex-python

Encode text from legacy ASCII font by Van Damme & Wurst to UTF-8

Stargazers: 1 | Issues: 0

TheoryOfComputation

Notes from a lecture on Theory of Computation at the University of Tokyo.

Language: TeX | Stargazers: 0 | Issues: 0

WebAlgo-Java-Class

Most of the code I write is closed source, and I wanted to give others a peek into my Java world. So I have made available, upon request, source code and documentation from a Web Algorithms class, part of a Java Certification track at UCB, which I took while independently learning Java. Granted, it is probably not up to the standard of what I do today, but it is at least something for people to look at should they feel the need to confirm that I have Java experience. During the class I wrote things as diverse as a Grails GWT mashup plugin for Eclipse, all the way to a LingPipe- and Lucene-based document classifier and a grammar processor for word separation in Coptic. The classifier (K-Means) and processor were capable of Coptic word splitting with high accuracy given very little training data, and could distinguish between poetry, religious texts, and business documents.

Language: HTML | License: LGPL-2.1 | Stargazers: 0 | Issues: 0