rgtjf / Good-Sentences


Good-Sentences

http://approximatelycorrect.com/2018/01/29/heuristics-technical-scientific-writing-machine-learning-perspective/?from=timeline&isappinstalled=0

Abstract

  • However, previous endeavors to (connect separate monolingual word embeddings) typically require (cross-lingual signals as supervision), either in the form of (a parallel corpus or a seed lexicon). e.g.,

  • We carry out evaluation on the (unsupervised bilingual lexicon induction) task. Even though this task appears intrinsically (cross-lingual), we are able to demonstrate encouraging performance without any (cross-lingual) clues. e.g.,

Introduction

  • Soon following the success on (monolingual tasks), the potential of (word embeddings for cross-lingual natural language processing) has attracted much attention. In their pioneering work, ().

  • This has far-reaching implications for low-resource scenarios (Daume et al., 2011).

  • This interesting finding is in line with research on human cognition (Youn et al., 2016).

  • This is unfortunate for low-resource languages and domains, because data encoding cross-lingual equivalence is often expensive to obtain.

  • The generator aims to make the transformed embeddings not only indistinguishable by the discriminator, but also recoverable as measured by the reconstruction loss $ \mathcal{L}_{R} $.
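The generator objective described in that sentence can be sketched numerically. This is a minimal illustration only: the linear map `W`, the toy frozen discriminator `D`, and all dimensions are hypothetical stand-ins, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 4   # embedding dimension (toy size)
n = 8   # number of source-language embeddings
X = rng.normal(size=(n, d))   # source-language embeddings
W = rng.normal(size=(d, d))   # linear generator (hypothetical)

# Transformed embeddings that the discriminator should not be able
# to tell apart from target-language embeddings.
Z = X @ W

# Reconstruction loss L_R: the transformation should be recoverable,
# here using W^T as the back-mapping.
X_rec = Z @ W.T
L_R = np.mean(np.sum((X - X_rec) ** 2, axis=1))

def D(emb):
    # Hypothetical frozen discriminator: sigmoid of a fixed projection.
    w = np.ones(emb.shape[1]) / emb.shape[1]
    return 1.0 / (1.0 + np.exp(-emb @ w))

# Adversarial term: the generator wants D(Z) close to 1,
# i.e. the transformed embeddings "look like" the target language.
L_adv = -np.mean(np.log(D(Z) + 1e-8))

# Combined generator objective: indistinguishable AND recoverable.
L_total = L_adv + L_R
```

In a real training loop the generator and discriminator would be updated alternately; here both are frozen just to show how the two loss terms combine.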

Related Work

  • In contrast, our work completely removes the need for cross-lingual signals to connect monolingual word embeddings, trained on non-parallel text corpora.

Model Description

Experiments Analysis

  • Generative adversarial networks are notoriously difficult to train, and investigation into more stable training remains a research frontier (Radford et al., 2015; Salimans et al., 2016; Ar..).

Conclusion

  • In this work, we demonstrate the feasibility of connecting word embeddings of different languages without any cross-lingual signal. This is achieved by matching the distributions of the transformed source language embeddings and the target ones via adversarial training. The success of our approach signifies the existence of universal lexical semantic structure across languages. Our work also opens up opportunities for the processing of extremely low-resource languages and domains that lack parallel data completely.

Introducing the work of this paper

Contrasting with others' work

Analyzing and explaining the experimental results

Summarizing the conclusions
