With recent advances in AI, natural language processing can now solve a wide range of problems. While working as an NLP engineer, I encountered many different tasks, and I thought it would be useful to gather and organize the ones I have dealt with in one place. Borrowing the format of Kyubyong's project, I have organized these natural language processing tasks with references and example code.
WIKI: Automated Essay Scoring | DATA: The Hewlett Foundation: Automated Essay Scoring | MODEL: BERT | MODEL: RoBERTa | MODEL: Electra | OFF-THE-SHELF: Pororo's AES
WIKI: Speech Recognition | DATA: LibriSpeech | DATA: AISHELL-1 | DATA: KsponSpeech | MODEL: Deep Speech2 | MODEL: Listen, Attend and Spell | MODEL: Wav2vec 2.0 | OFF-THE-SHELF: Pororo's ASR | CODE: Example with KsponSpeech
WIKI: Dialogue System | DATA: Persona Chat | DATA: Korean SNS Corpus | MODEL: Dialogue GPT | CODE: Example with Korean SNS Corpus
WIKI: Dialogue System | DATA: Persona Chat | DATA: Ubuntu Dialogue Corpus | DATA: Korean SNS Corpus | MODEL: Poly Encoder | CODE: Example with Ubuntu Dialogue Corpus
WIKI: Cloze Test | INFO: Masked-Language-Modeling with BERT | MODEL: BERT | MODEL: RoBERTa | OFF-THE-SHELF: Pororo's Fill in the Blank | CODE: Example with WikiCorpus
WIKI: Autocorrection | DATA: NUS Non-commercial research/trial corpus license | DATA: Cornell Movie--Dialogs Corpus | OFF-THE-SHELF: Pororo's GEC
WIKI: Grapheme | WIKI: Phoneme | REPRESENTATIVE-DATA: Multilingual Pronunciation Data | OFF-THE-SHELF-MODEL: Pororo's G2P
PAPER: Wizard of Wikipedia: Knowledge-Powered Conversational agents | DATA: Wizard of Wikipedia | CODE: Example with Wizard of Wikipedia
WIKI: Language Model | INFO: A beginner’s guide to language models | MODEL: GPT3 | MODEL: GPT2 | MODEL: Ken-LM | MODEL: RNN-LM | CODE: Example with OpenWebText
WIKI: Reading Comprehension | INFO: Machine Reading Comprehension with BERT | DATA: SQuAD | DATA: KorQuad | MODEL: BERT | MODEL: RoBERTa | MODEL: Electra | OFF-THE-SHELF: Pororo's MRC | CODE: Example with SQuAD & KorQuad
WIKI: Translation | DATA: WMT 2014 English-to-French | DATA: Korean-English translation corpus | MODEL: Transformer | OFF-THE-SHELF: Pororo's Translation | CODE: Example with Korean-English translation corpus
PAPER-WITH-CODE: Math Word Problem Solving | DATA: DeepMind Mathematics Dataset | DATA: KMWP (Korean Math Word Problems) | CODE: Example with KMWP
WIKI: Textual Entailment | DATA: GLUE-MNLI | DATA: KorNLI | MODEL: BERT | MODEL: RoBERTa | MODEL: Electra | OFF-THE-SHELF: Pororo's NLI | CODE: Example with GLUE-MNLI
WIKI: Named Entity Recognition | DATA: CoNLL-2002 NER corpus | DATA: CoNLL-2003 NER corpus | DATA: Naver NER | MODEL: BERT | MODEL: RoBERTa | MODEL: Electra | OFF-THE-SHELF: Pororo's NER | CODE: Example with Naver NER
WIKI: Paraphrase | OFF-THE-SHELF: Pororo's Paraphrase Generation
OFF-THE-SHELF: Pororo's P2G
WIKI: Sentiment Analysis | DATA: GLUE-SST | DATA: NSMC | MODEL: BERT | MODEL: RoBERTa | MODEL: Electra | OFF-THE-SHELF: Pororo's Sentiment Analysis | CODE: Example with NSMC
WIKI: Semantic Similarity | DATA: GLUE-STS | DATA: KorSTS | MODEL: BERT | MODEL: RoBERTa | MODEL: Electra | OFF-THE-SHELF: Pororo's STS | CODE: Example with SQuAD
WIKI: Speech Synthesis | DATA: LJ Speech | DATA: CSS10 | DATA: KSS | MODEL: Tacotron2 | MODEL: FastSpeech2 | MODEL: WaveNet | MODEL: Hifi-GAN | OFF-THE-SHELF: Pororo's TTS | CODE: Example with LJ-Speech | CODE: Example with KSS
WIKI: Automatic Summarization | DATA: XSum | DATA: Korean Summarization Corpus | MODEL: BART | OFF-THE-SHELF: Pororo's Summarization | CODE: Example with XSum
- Soohwan Kim @sooftware
- Contacts: sh951011@gmail.com