Jaume Zaragoza (ZJaume)

Company: Prompsit Language Engineering


Organizations
bitextor
macocu
paracrawl
Prompsit

Jaume Zaragoza's repositories

clean

A tool for downloading and cleaning parallel corpora

Language: Python | License: NOASSERTION | Stargazers: 3 | Issues: 0 | Issues: 0

image_omr

Optical Music Recognition with RNNs in Keras

Language: Python | License: GPL-3.0 | Stargazers: 3 | Issues: 0 | Issues: 0

paraphrasing

A repository with different paraphrasing-related tools: Sent2vec and paraphrase generation.

Language: Python | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

terminology

Tools to annotate parallel data with terminology for NMT forced translation

Language: Python | License: GPL-3.0 | Stargazers: 1 | Issues: 0 | Issues: 0

tmxt

Transform TMX to text

Language: Python | License: GPL-3.0 | Stargazers: 1 | Issues: 0 | Issues: 0

arch-install

Simple bash script to install Arch Linux.

Language: Shell | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

bicleaner

Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.

Language: Python | License: GPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Computer-Vision

Computer vision repository

Language: Jupyter Notebook | License: GPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

cyrillic-transliteration

Transliterate Cyrillic script to Latin script and vice versa.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

datasketch

MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

diceware-cat

Catalan dictionaries for generating Diceware passwords

Language: TeX | Stargazers: 0 | Issues: 1 | Issues: 0

Domain_Adaptation

InDomain detection is a tool designed to extract in-domain data from large collections of data.

Language: Python | License: GPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

dotfiles

My dotfiles

Language: CSS | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

escape-unk

Escape unknown symbols in SentencePiece vocabularies

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

fastspell

Targeted language identifier, based on FastText and Hunspell.

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

gaoya

Locality Sensitive Hashing

Language: Rust | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Infinity-For-Reddit

A Reddit client for Android

License: AGPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

LanguagePack

A language pack project for AnySoftKeyboard

Language: HTML | Stargazers: 0 | Issues: 0 | Issues: 0

lttoolbox

Finite-state compiler, processor and helper tools used by Apertium

Language: C++ | License: GPL-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Stargazers: 0 | Issues: 0 | Issues: 0

sacrebleu

Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

serde-fancy-regex

A serde-regex fork to (de)serialize fancy-regex regular expressions

Language: Rust | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

splitters

A CLI for Rust SRX sentence segmentation rules as a Python package.

Language: Rust | License: GPL-3.0 | Stargazers: 0 | Issues: 1 | Issues: 3

srx

A mostly compliant Rust implementation of the Segmentation Rules eXchange (SRX) 2.0 standard for text segmentation.

Language: Rust | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

students

Efficient teacher-student models and scripts to make them

Language: Handlebars | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0