FernandoLpz / AuthorVerificiation

This repository presents a siamese architecture for author verification: given a pair of documents, decide whether both were written by the same author based on their writing style. The siamese architecture is composed of two convolutional layers and an LSTM recurrent neural network, followed by a Euclidean distance.

Author Verification using a Siamese Architecture

The implementation uses Keras 2.0 on Python 3.5.2.

Introduction

The proposed model is based on the idea that a network can learn a function that decides, given a pair of documents, whether both were written by the same author based on their writing style.
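
One way to read this idea is as a distance between two encoded documents followed by a decision rule. The sketch below is a minimal illustration in plain Python, not the repository's code. It assumes each document has already been encoded as a fixed-length feature vector; the names euclidean_distance and same_author and the threshold value are hypothetical, and the numbers are made up.

```python
import math

def euclidean_distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def same_author(u, v, threshold=0.5):
    """Hypothetical decision rule: a small distance suggests the same author."""
    return euclidean_distance(u, v) < threshold

# Toy encodings (made-up numbers, not real model outputs).
doc_a = [0.10, 0.80, 0.30]
doc_b = [0.12, 0.79, 0.33]
doc_c = [0.90, 0.10, 0.70]

print(same_author(doc_a, doc_b))
print(same_author(doc_a, doc_c))
```

In a trained siamese network the threshold would typically be chosen on validation data; the value here is only a placeholder.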

The model

The model receives two inputs: a pair of sequences of character embeddings, each representing a document by a given author. Each sequence passes through a first convolutional layer, which extracts local features from fragments of the sequence. The second convolutional layer then extracts features from the output of the first. Next, a max-pooling layer keeps the most important features describing the author's style. Finally, these features are passed through an LSTM, which learns the author's stylistic signature.
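
The convolution and pooling steps of one branch can be sketched in plain Python. This is a toy forward pass under stated assumptions, not the repository's Keras model: the filter weights, filter width, pool size, and ReLU activation are all made up for illustration, the function names are hypothetical, and the LSTM stage is omitted.

```python
def conv1d(seq, filters, width):
    """Valid 1D convolution with ReLU over a sequence of vectors.

    seq: list of time steps, each a list of floats (dimension d).
    filters: list of filters, each a flat list of width * d weights.
    Returns one output vector (len(filters) values) per window.
    """
    out = []
    for start in range(len(seq) - width + 1):
        # Flatten the current window of `width` time steps into one vector.
        window = [x for step in seq[start:start + width] for x in step]
        out.append([max(0.0, sum(w * x for w, x in zip(f, window)))
                    for f in filters])
    return out

def max_pool1d(seq, pool):
    """Non-overlapping max pooling along the time axis."""
    pooled = []
    for start in range(0, len(seq) - pool + 1, pool):
        block = seq[start:start + pool]
        pooled.append([max(col) for col in zip(*block)])
    return pooled

def branch(seq, filters1, filters2, width=2, pool=2):
    """Feature extractor applied with the same weights to each document."""
    h1 = conv1d(seq, filters1, width)   # local features from fragments
    h2 = conv1d(h1, filters2, width)    # features of those features
    return max_pool1d(h2, pool)         # strongest responses; an LSTM would follow

# Toy input: 6 characters with embedding dimension 2 (made-up values).
doc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
filters1 = [[1.0, 0.0, 0.0, 1.0], [0.5, 0.5, -1.0, 0.0]]  # 2 filters of width 2 over dim 2
filters2 = [[1.0, -1.0, 0.5, 0.5]]                        # 1 filter of width 2 over dim 2

features = branch(doc, filters1, filters2)
print(len(features), len(features[0]))
```

Because the architecture is siamese, the same branch with the same filters would be applied to both documents of a pair. In Keras, the corresponding building blocks are likely the Conv1D, MaxPooling1D, and LSTM layers, though the exact configuration used in the repository is not stated here.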

Citation

@article{FerLpzV2018,
author = {Fernando-López, Gibrán-Fuentes and Fabian-Garcia},
title = {Authorship Verification from Texts through Convolutional and Recurrent Neural Networks},
year = {2018},
}

About

License:BSD 3-Clause "New" or "Revised" License


Languages

Language:Python 100.0%