zjcanjux / RNN_Joint_NLU_Chinese

Train a Joint_NLU model on Chinese data: a joint intent detection and slot filling model, implemented in both TensorFlow and PyTorch.


Joint_NLU Chinese

Description

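A joint model for two tasks: intent detection and slot filling.

Both tasks are combined in a single model built on the seq2seq framework.

The encoder is implemented with tf.nn.bidirectional_dynamic_rnn.

The decoder is implemented with tf.contrib.seq2seq.dynamic_decode.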

Usage

The file seq_slot_intent.text is an example of data in the format suitable for B_I_slot_label.py.

Put the Chinese word embeddings file into the Word2Vec folder.

python3 tain_and_test.py

Output:

Slot Prediction       :  ['B-金额', 'O', 'O', 'O']
Intent Truth          :  开通流量
Intent Prediction     :  开通流量
Intent accuracy for epoch 4: 0.8425925925925926
Slot accuracy for epoch 4: 0.9632771300764675
Slot F1 score for epoch 4: 0.9677923702313946
global_step 5040
[Epoch 5] Average train loss: 0.09832338647591689
Input Sentence        :  ['我', '想', '更改', '宽带', '密码']
Slot Truth            :  ['O', 'O', 'O', 'O', 'B-附属标签']
Slot Prediction       :  ['O', 'O', 'O', 'O', 'B-附属标签']
Intent Truth          :  修改宽带
Intent Prediction     :  修改宽带
Intent accuracy for epoch 5: 0.8564814814814815
Slot accuracy for epoch 5: 0.9589992218974985
Slot F1 score for epoch 5: 0.9642521166509878

Detail

B_I_slot_label.py processes the raw data with the BIO labeling method to annotate the Chinese corpus.
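As an illustration of the BIO labeling scheme, here is a minimal sketch. It is not the code in B_I_slot_label.py; the function name `bio_tags` and the `(start, end, label)` span format are assumptions made for this example. The first token of a slot gets a `B-` tag, following tokens of the same slot get `I-` tags, and everything else gets `O`.

```python
from typing import List, Tuple


def bio_tags(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Label tokens with BIO tags.

    `spans` holds hypothetical (start, end, label) triples that index
    into `tokens`, with `end` exclusive.
    """
    tags = ["O"] * len(tokens)           # default: outside any slot
    for start, end, label in spans:
        tags[start] = "B-" + label       # first token of the slot
        for i in range(start + 1, end):
            tags[i] = "I-" + label       # remaining tokens of the slot
    return tags


tokens = ["我", "想", "更改", "宽带", "密码"]
tags = bio_tags(tokens, [(4, 5, "附属标签")])
print(tags)  # ['O', 'O', 'O', 'O', 'B-附属标签']
```

The sentence and label are taken from the sample output in the Usage section, so the result matches the "Slot Truth" line shown there.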

data.py converts the BIO-labeled data into index data.
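The conversion to index data can be sketched roughly as follows. This is an assumption about the general approach, not the contents of data.py; the names `build_vocab` and `to_indices` and the special tokens `<PAD>` and `<UNK>` are hypothetical.

```python
from typing import Dict, Iterable, List

PAD, UNK = "<PAD>", "<UNK>"  # hypothetical special tokens


def build_vocab(sequences: Iterable[List[str]]) -> Dict[str, int]:
    """Assign an integer index to every distinct item, in order of first appearance."""
    vocab = {PAD: 0, UNK: 1}
    for seq in sequences:
        for item in seq:
            vocab.setdefault(item, len(vocab))
    return vocab


def to_indices(seq: List[str], vocab: Dict[str, int]) -> List[int]:
    """Map items to indices; unseen items fall back to the UNK index."""
    return [vocab.get(item, vocab[UNK]) for item in seq]


sentences = [["我", "想", "更改", "宽带", "密码"]]
word2idx = build_vocab(sentences)
print(to_indices(["我", "想", "开通"], word2idx))  # [2, 3, 1]
```

The same two functions can be applied to the slot tags and intent labels to get one lookup table per field.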

A PyTorch implementation of the joint model has been added.

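See the pytorch joint folder. The main change is a small computational-efficiency optimization: the sentences in each batch are padded to the same length, using the longest sentence in that batch as the reference, rather than padding every batch to one uniform length.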
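The per-batch padding idea can be sketched in plain Python as below. This is a simplified illustration, not the code in the pytorch joint folder; `pad_batch`, `iter_batches`, and the padding index 0 are assumptions.

```python
from typing import Iterator, List, Tuple


def pad_batch(batch: List[List[int]], pad_id: int = 0) -> Tuple[List[List[int]], List[int]]:
    """Pad every sequence to the length of the longest sequence in this batch."""
    lengths = [len(seq) for seq in batch]
    max_len = max(lengths)
    padded = [seq + [pad_id] * (max_len - len(seq)) for seq in batch]
    return padded, lengths


def iter_batches(data: List[List[int]], batch_size: int) -> Iterator[Tuple[List[List[int]], List[int]]]:
    """Yield padded batches; each batch has its own maximum length."""
    for i in range(0, len(data), batch_size):
        yield pad_batch(data[i:i + batch_size])


data = [[2, 3], [4, 5, 6, 7], [8], [9, 10]]
for padded, lengths in iter_batches(data, batch_size=2):
    print(padded, lengths)
# [[2, 3, 0, 0], [4, 5, 6, 7]] [2, 4]
# [[8, 0], [9, 10]] [1, 2]
```

The second batch is padded only to length 2 instead of the global maximum of 4, which is where the savings come from. The returned lengths can be used to mask or pack the padded positions.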

Word embeddings used: link: https://pan.baidu.com/s/1RaEcYVW5n6Dz7-GBtSP_sA password: unzz. Other word embeddings can also be used as needed.



