An Approach for Building Lexical-Semantic Resources Based on Heterogeneous Information Sources

Proceedings of the 30th Annual ACM Symposium on Applied Computing, pages 402--408. ACM, (2015)
DOI: 10.1145/2695664.2695896

Abstract

Lexical-semantic resources (LSRs) play an important role in many information retrieval and extraction tasks. To be effective, however, LSRs need to cover a broad spectrum of knowledge facets about terms, e.g., encyclopedic, linguistic, and common-sense knowledge. These facets are usually found in different, heterogeneous knowledge sources spread over the Web, which makes integrating them into a unified LSR a hard task. In this work, we propose a new approach to automatically build LSRs tailored to semantic search engines, i.e., LSRs that favor disambiguation and faceted search. Moreover, while most related work is limited to encyclopedic sources such as Wikipedia, we instantiate our approach using additional, heterogeneous knowledge sources, such as dictionaries (WordNet), common sense (Open Mind Common Sense), and semantic networks (ConceptNet5). For evaluation, we compare our approach with UBY, an open-source, state-of-the-art LSR, in terms of the number of terms covered, the degree of semantic connections established for each term, and the strength of the extracted semantic relations between terms. We conducted experiments on several test sets and show that our approach is superior in all three aspects.
