LEARNING CROSS-LINGUAL WORD EMBEDDINGS WITH UNIVERSAL CONCEPTS

P. Sheinidashtego and A. Musaev.
International Journal on Web Service Computing (IJWSC), 10 (1/2/3): 13-20 (September 2019)
DOI: 10.5121/ijwsc.2019.10302

Abstract

Recent advances in generating monolingual word embeddings based on word co-occurrence for universal languages have inspired new efforts to extend the model to support diverse languages. State-of-the-art methods for learning cross-lingual word embeddings rely on the alignment of monolingual word embedding spaces. Our goal is instead to apply word co-occurrence across languages by means of universal concepts. Such concepts are notions that are fundamental to humankind and thus persist across languages, e.g., man or woman, war or peace. Given bilingual lexicons, we build universal concepts as undirected graphs of connected nodes and then replace the words belonging to the same graph with a unique graph ID. This intuitive design makes use of universal concepts in monolingual corpora, which helps generate meaningful word embeddings across languages through word co-occurrence. Standardized benchmarks demonstrate that this underutilized approach is competitive with the state of the art on bilingual word semantic similarity and word similarity and relatedness tasks.
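
The graph-building and replacement step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the bilingual lexicon is a list of word pairs, that words carry a hypothetical language prefix such as `en:` or `es:` to avoid cross-language spelling collisions, and that graph IDs take the hypothetical form `CONCEPT_n`. Each connected component of the undirected lexicon graph is treated as one universal concept.

```python
from collections import defaultdict


def build_concepts(lexicon):
    """Map every lexicon word to the ID of its connected component.

    `lexicon` is an iterable of (source_word, target_word) pairs.
    The pair format and the CONCEPT_n labels are assumptions made
    for this sketch, not details taken from the paper.
    """
    # Undirected graph: each translation pair is an edge.
    graph = defaultdict(set)
    for src, tgt in lexicon:
        graph[src].add(tgt)
        graph[tgt].add(src)

    concept_of = {}
    next_id = 0
    # Sorted iteration keeps the assigned IDs deterministic.
    for start in sorted(graph):
        if start in concept_of:
            continue
        label = f"CONCEPT_{next_id}"
        next_id += 1
        # Depth-first traversal labels the whole component.
        stack = [start]
        while stack:
            node = stack.pop()
            if node in concept_of:
                continue
            concept_of[node] = label
            stack.extend(n for n in graph[node] if n not in concept_of)
    return concept_of


def replace_with_concepts(tokens, concept_of):
    """Replace each token that belongs to a concept with its graph ID."""
    return [concept_of.get(tok, tok) for tok in tokens]


# Toy lexicon (illustrative data only).
lexicon = [
    ("en:man", "es:hombre"),
    ("en:human", "es:hombre"),
    ("en:war", "es:guerra"),
]
concept_of = build_concepts(lexicon)
english = replace_with_concepts(["en:the", "en:man"], concept_of)
spanish = replace_with_concepts(["es:el", "es:hombre"], concept_of)
```

After replacement, `en:man` and `es:hombre` share one token in both corpora, so a standard co-occurrence-based embedding model trained on the combined text would see them as the same vocabulary item. Words outside the lexicon are left unchanged.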

BibTeX key: noauthororeditor
entry type: article
year: 2019
month: September
journal: International Journal on Web Service Computing (IJWSC)
number: 1/2/3
pages: 13-20
volume: 10
language: English
issn: 0976 - 9811 (Online); 2230 - 7702 (print)
DOI: 10.5121/ijwsc.2019.10302
Document: https://aircconline.com/ijwsc/V10N3/10319ijwsc02.pdf
