This project uses text from the Bible to train a model that identifies the
language of a given passage. In particular, we focused on Italic-Romance
languages, Germanic languages, and the unrelated language Tagalog.
Language Codes (ISO-639-1 Standard)

| Code | Language | Group | Version Used |
|------|----------|-------|--------------|
| ca | Catalan | Italic-Romance | Bíblia Catalana Interconfessional (BCI) |
| da | Danish | Germanic | Bibelen på hverdagsdansk (BPH) |
| de | German | Germanic | Schlachter 2000 (SCH2000) |
| en | English | Germanic | 21st Century King James Version (KJ21) |
| es | Spanish | Italic-Romance | Reina-Valera 1960 (RVR1960) |
| fr | French | Italic-Romance | Louis Segond (LSG) |
| is | Icelandic | Germanic | Icelandic Bible (ICELAND) |
| it | Italian | Italic-Romance | La Nuova Diodati (LND) |
| la | Latin | Italic-Romance | Biblia Sacra Vulgata (VULGATE) |
| nl | Dutch | Germanic | Het Boek (HTB) |
| no | Norwegian | Germanic | En Levende Bok (LB) |
| pt | Portuguese | Italic-Romance | Almeida Revista e Corrigida 2009 (ARC) |
| ro | Romanian | Italic-Romance | Cornilescu 1924 - Revised 2010, 2014 (RMNN) |
| sv | Swedish | Germanic | Nya Levande Bibeln (SVL) |
| tl | Tagalog | Other | Ang Dating Biblia (1905) (ADB1905) |
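
The language codes and groups listed above can be kept in a small lookup table, for example to attach a group label to each training sample. The following is a minimal sketch, not the project's actual code: the names `LANGUAGE_GROUPS`, `group_of`, and `codes_in_group` are hypothetical and chosen only for illustration.

```python
# Hypothetical lookup table: ISO 639-1 code -> (language name, group).
# The entries are copied from the language table above.
LANGUAGE_GROUPS = {
    "ca": ("Catalan", "Italic-Romance"),
    "da": ("Danish", "Germanic"),
    "de": ("German", "Germanic"),
    "en": ("English", "Germanic"),
    "es": ("Spanish", "Italic-Romance"),
    "fr": ("French", "Italic-Romance"),
    "is": ("Icelandic", "Germanic"),
    "it": ("Italian", "Italic-Romance"),
    "la": ("Latin", "Italic-Romance"),
    "nl": ("Dutch", "Germanic"),
    "no": ("Norwegian", "Germanic"),
    "pt": ("Portuguese", "Italic-Romance"),
    "ro": ("Romanian", "Italic-Romance"),
    "sv": ("Swedish", "Germanic"),
    "tl": ("Tagalog", "Other"),
}


def group_of(code: str) -> str:
    """Return the language group for an ISO 639-1 code (case-insensitive)."""
    _name, group = LANGUAGE_GROUPS[code.lower()]
    return group


def codes_in_group(group: str) -> list[str]:
    """Return the codes that belong to a group, sorted alphabetically."""
    return sorted(c for c, (_name, g) in LANGUAGE_GROUPS.items() if g == group)
```

With this table, `group_of("is")` looks up Icelandic and returns its group, and `codes_in_group("Germanic")` lists the Germanic codes. An unknown code raises `KeyError`, which is a deliberate choice in this sketch so that a mistyped label fails loudly.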
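
| Book | Chapter |
|------|---------|
| Genesis | 1 |
| Exodus | 20 |
| 1 Kings | 17 |
| Psalm | 119 |
| Isaiah | 53 |
| Daniel | 6 |
| Habakkuk | 2 |
| Matthew | 6 |
| Mark | 10 |
| John | 1 |
| John | 3 |
| Romans | 12 |
| Galatians | 5 |
| Colossians | 6 |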