Pretrained language models are now ubiquitous in Natural Language Processing.
Despite their success, most available models have either been trained on
English data or on the concatenation of data in multiple languages. This makes
practical use of such models, in all languages except English, very limited.
In this paper, we investigate the feasibility of training monolingual
Transformer-based language models for other languages, taking French as an
example and evaluating our language models on part-of-speech tagging,
dependency parsing, named entity recognition and natural language inference
tasks. We show that the use of web-crawled data is preferable to the use of
Wikipedia data. More surprisingly, we show that a relatively small web-crawled
dataset (4GB) leads to results that are as good as those obtained using larger
datasets (130+GB). Our best-performing model, CamemBERT, reaches or improves
the state of the art in all four downstream tasks.
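
For readers who want to try the model in practice, here is a minimal usage
sketch of querying CamemBERT for masked-word prediction, assuming the publicly
released "camembert-base" checkpoint is available through the Hugging Face
transformers library; the example sentence and printed fields are
illustrative, not taken from the paper.

    # Minimal sketch, assuming `pip install transformers torch` and the
    # "camembert-base" checkpoint published by the CamemBERT authors.
    from transformers import pipeline

    # Load CamemBERT with its masked-language-modeling head.
    fill_mask = pipeline("fill-mask", model="camembert-base")

    # Predict the token hidden behind CamemBERT's <mask> placeholder.
    for prediction in fill_mask("Le camembert est <mask> !"):
        print(prediction["token_str"], round(prediction["score"], 3))

The downstream tasks listed in the abstract (part-of-speech tagging,
dependency parsing, named entity recognition, natural language inference)
would instead fine-tune the encoder, e.g. with
AutoModelForTokenClassification or AutoModelForSequenceClassification.
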
%0 Conference Paper
%1 martin2019camembert
%A Martin, Louis
%A Muller, Benjamin
%A Ortiz Suárez, Pedro Javier
%A Dupont, Yoann
%A Romary, Laurent
%A de la Clergerie, Éric Villemonte
%A Seddah, Djamé
%A Sagot, Benoît
%B ACL
%D 2020
%K antrag bert deconspire french language model nlp
%T CamemBERT: a Tasty French Language Model
%U https://camembert-model.fr/
@inproceedings{martin2019camembert,
author = {Martin, Louis and Muller, Benjamin and Ortiz Suárez, Pedro Javier and Dupont, Yoann and Romary, Laurent and de la Clergerie, Éric Villemonte and Seddah, Djamé and Sagot, Benoît},
booktitle = {ACL},
keywords = {antrag bert deconspire french language model nlp},
title = {CamemBERT: a Tasty French Language Model},
url = {https://camembert-model.fr/},
year = 2020
}