A Statistical Model for Multilingual Entity Detection and Tracking
R. Florian, H. Hassan, H. Jing, N. Kambhatla, X. Luo, N. Nicolov, and S. Roukos. Proceedings of the Human Language Technologies Conference 2004 (HLT-NAACL'04), pages 1--8. Boston, Massachusetts, USA, Association for Computational Linguistics, (May 2004)
Abstract
Entity detection and tracking is a relatively new addition to the repertoire of natural language tasks. In this paper, we present a statistical language-independent framework for identifying and tracking named, nominal and pronominal references to entities within unrestricted text documents, and chaining them into clusters corresponding to each logical entity present in the text. Both the mention detection model and the novel entity tracking model can use arbitrary feature types, being able to integrate a wide array of lexical, syntactic and semantic features. In addition, the mention detection model crucially uses feature streams derived from different named entity classifiers. The proposed framework is evaluated with several experiments run in Arabic, Chinese and English texts; a system based on the approach described here and submitted to the latest Automatic Content Extraction (ACE) evaluation achieved top-tier results in all three evaluation languages.
%0 Conference Paper
%1 citeulike:821984
%A Florian, Radu
%A Hassan, Hany
%A Jing, Hongyan
%A Kambhatla, Nanda
%A Luo, Xiaoqiang
%A Nicolov, Nicolas
%A Roukos, Salim
%B Proceedings of the Human Language Technologies Conference 2004 (HLT-NAACL'04)
%C Boston, Massachusetts, USA
%D 2004
%E Marcu, Daniel
%E Dumais, Susan
%E Roukos, Salim
%I Association for Computational Linguistics
%K named-entity
%P 1--8
%T A Statistical Model for Multilingual Entity Detection and Tracking
%U http://acl.ldc.upenn.edu/hlt-naacl2004/main/pdf/128_Paper.pdf
%X Entity detection and tracking is a relatively new addition to the repertoire of natural language tasks. In this paper, we present a statistical language-independent framework for identifying and tracking named, nominal and pronominal references to entities within unrestricted text documents, and chaining them into clusters corresponding to each logical entity present in the text. Both the mention detection model and the novel entity tracking model can use arbitrary feature types, being able to integrate a wide array of lexical, syntactic and semantic features. In addition, the mention detection model crucially uses feature streams derived from different named entity classifiers. The proposed framework is evaluated with several experiments run in Arabic, Chinese and English texts; a system based on the approach described here and submitted to the latest Automatic Content Extraction (ACE) evaluation achieved top-tier results in all three evaluation languages.
@inproceedings{citeulike:821984,
abstract = {Entity detection and tracking is a relatively new addition to the repertoire of natural language tasks. In this paper, we present a statistical language-independent framework for identifying and tracking named, nominal and pronominal references to entities within unrestricted text documents, and chaining them into clusters corresponding to each logical entity present in the text. Both the mention detection model and the novel entity tracking model can use arbitrary feature types, being able to integrate a wide array of lexical, syntactic and semantic features. In addition, the mention detection model crucially uses feature streams derived from different named entity classifiers. The proposed framework is evaluated with several experiments run in Arabic, Chinese and English texts; a system based on the approach described here and submitted to the latest Automatic Content Extraction (ACE) evaluation achieved top-tier results in all three evaluation languages.},
added-at = {2009-07-01T11:12:30.000+0200},
address = {Boston, Massachusetts, USA},
author = {Florian, Radu and Hassan, Hany and Jing, Hongyan and Kambhatla, Nanda and Luo, Xiaoqiang and Nicolov, Nicolas and Roukos, Salim},
biburl = {https://www.bibsonomy.org/bibtex/25a3c5bbf8e8ee4fafd1173f7944d19b9/brusilovsky},
booktitle = {Proceedings of the Human Language Technologies Conference 2004 (HLT-NAACL'04)},
citeulike-article-id = {821984},
editor = {Marcu, Daniel and Dumais, Susan and Roukos, Salim},
interhash = {6891d15b902462a5034bd28fd070f4c9},
intrahash = {5a3c5bbf8e8ee4fafd1173f7944d19b9},
keywords = {named-entity},
month = May,
pages = {1--8},
posted-at = {2007-12-26 17:58:46},
priority = {1},
publisher = {Association for Computational Linguistics},
timestamp = {2009-07-01T11:12:36.000+0200},
title = {A Statistical Model for Multilingual Entity Detection and Tracking},
url = {http://acl.ldc.upenn.edu/hlt-naacl2004/main/pdf/128\_Paper.pdf},
year = 2004
}