PURE: A Dataset of Public Requirements Documents

A. Ferrari, G. Spagnolo, and S. Gnesi. 2017 IEEE 25th International Requirements Engineering Conference (RE), pages 502-505 (September 2017).
DOI: 10.1109/RE.2017.29

Abstract

This paper presents PURE (PUblic REquirements dataset), a dataset of 79 publicly available natural language requirements documents collected from the Web. The dataset includes 34,268 sentences and can be used for natural language processing tasks that are typical in requirements engineering, such as model synthesis, abstraction identification and document structure assessment. It can be further annotated to work as a benchmark for other tasks, such as ambiguity detection, requirements categorisation and identification of equivalent requirements. In the paper, we present the dataset and we compare its language with generic English texts, showing the peculiarities of the requirements jargon, made of a restricted vocabulary of domain-specific acronyms and words, and long sentences. We also present the common XML format to which we have manually ported a subset of the documents, with the goal of facilitating replication of NLP experiments.

Description

PURE: A Dataset of Public Requirements Documents - IEEE Conference Publication

Links and resources

BibTeX key: 8049173
entry type: inproceedings
booktitle: 2017 IEEE 25th International Requirements Engineering Conference (RE)
year: 2017
month: Sep.
pages: 502-505
issn: 2332-6441
DOI: 10.1109/RE.2017.29
url: https://ieeexplore.ieee.org/abstract/document/8049173

Tags

dataset
web