zsc19 / CTran

Complete code for the proposed CNN-Transformer model for natural language understanding.


CTRAN: CNN-Transformer-based Network for Natural Language Understanding

PyTorch implementation, as described in https://www.sciencedirect.com/science/article/pii/S0952197623011971.


Introduction

[Figure: CTRAN CNN-Transformer model architecture]

This repository contains CTRAN, a CNN-Transformer-based encoder-decoder network for joint intent detection and slot filling. In the encoder, BERT provides the word embeddings. A convolutional operation is then applied to the word embeddings, and its output is restructured using the window feature sequence. The final part of the encoder is a stack of Transformer encoders. For intent detection, the decoder comprises self-attention and a linear layer that produces the output probabilities. For slot filling, we propose an aligned Transformer decoder followed by a fully connected layer. For more information, please refer to the EAAI article.
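The encoder pipeline above (BERT embeddings → CNN → window feature sequence → stacked Transformer encoders) can be sketched as follows. This is an illustrative sketch only, not the authors' code: the class name, layer sizes, and the padded-convolution approximation of the window feature sequence are assumptions.

```python
import torch
import torch.nn as nn

class CTRANEncoderSketch(nn.Module):
    """Sketch of the CTRAN encoder pipeline described above.
    Layer names and hyperparameters are illustrative assumptions."""

    def __init__(self, hidden=768, conv_channels=768, kernel_size=3, n_layers=2):
        super().__init__()
        # Convolution over the sequence dimension of the BERT word embeddings
        self.conv = nn.Conv1d(hidden, conv_channels, kernel_size,
                              padding=kernel_size // 2)
        enc_layer = nn.TransformerEncoderLayer(d_model=conv_channels, nhead=8,
                                               batch_first=True)
        # Stacked Transformer encoders form the final part of the encoder
        self.transformer = nn.TransformerEncoder(enc_layer, num_layers=n_layers)

    def forward(self, bert_embeddings):
        # bert_embeddings: (batch, seq_len, hidden), e.g. BERT's last_hidden_state
        x = bert_embeddings.transpose(1, 2)            # (batch, hidden, seq_len)
        x = torch.relu(self.conv(x)).transpose(1, 2)   # (batch, seq_len, channels)
        # The window-feature-sequence restructuring is approximated here by the
        # padded convolution, which yields one feature vector per token.
        return self.transformer(x)                     # (batch, seq_len, channels)
```

The output keeps one vector per input token, so it can feed both the intent-detection decoder (via pooling or self-attention) and the token-aligned slot-filling decoder.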

Requirements

  • A CUDA-capable GPU
  • Python
  • PyTorch
  • Jupyter
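Before training, it is worth confirming that PyTorch can actually see the GPU; a minimal check (standard PyTorch calls, nothing specific to this repository):

```python
import torch

# Confirms a CUDA-capable GPU is visible to PyTorch and picks the device.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"PyTorch {torch.__version__}, using device: {device}")
```

If this prints `cpu`, training will still run but far slower than the times reported below.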

Dependencies

numpy==1.23.5
scikit_learn==1.2.2
torch==1.13.0
tqdm==4.64.1
transformers==4.25.1
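The pinned versions above can be installed in one step; saving them to a `requirements.txt` file and running `pip install -r requirements.txt` is the usual alternative:

```shell
# Install the pinned dependencies listed above (assumes pip is available)
pip install numpy==1.23.5 scikit_learn==1.2.2 torch==1.13.0 tqdm==4.64.1 transformers==4.25.1
```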

Runtime

On Windows 10 with an RTX 3080 and BERT-base, the approximate training time on the ATIS dataset is 46.71 seconds per epoch. On the same setup, SNIPS training takes 119.5 seconds per epoch.
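Per-epoch times like those above can be reproduced with a simple wall-clock wrapper around the training loop; this is a generic sketch (the model, loader, and loss names are illustrative, not this repository's API):

```python
import time

def timed_epoch(model, loader, optimizer, loss_fn, device):
    """Run one training epoch and return its wall-clock duration in seconds."""
    model.train()
    start = time.perf_counter()
    for inputs, targets in loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
    return time.perf_counter() - start
```

Note that the first epoch often runs longer than subsequent ones due to CUDA kernel warm-up and data-loader startup, so timing several epochs and averaging gives a fairer figure.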

Citation

If you use any part of our code, please consider citing our paper as follows:

@article{Rafiepour2023,
title = {CTRAN: CNN-Transformer-based network for natural language understanding},
journal = {Engineering Applications of Artificial Intelligence},
volume = {126},
pages = {107013},
year = {2023},
issn = {0952-1976},
doi = {10.1016/j.engappai.2023.107013},
url = {https://www.sciencedirect.com/science/article/pii/S0952197623011971},
author = {Mehrdad Rafiepour and Javad Salimi Sartakhti},
keywords = {Natural language understanding, Slot-filling, Intent-detection, Transformers, CNN - Transformer encoder, Aligned transformer decoder, BERT, ELMo}
}

License

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

About


License: Apache License 2.0


Languages

Language: Jupyter Notebook 90.4%, Python 9.6%