Treebanks. Building and using parsed corpora (Q1414851)
From MaRDI portal
scientific article; zbMATH DE number 2013755
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Treebanks. Building and using parsed corpora | scientific article; zbMATH DE number 2013755 | |
Statements
Treebanks. Building and using parsed corpora (English)
7 December 2003
This compilation contains 21 papers on building and using syntactically parsed natural-language corpora (so-called treebanks). The topics covered are: the proper choice of the corpus to be annotated; the kind of annotation to be added (part-of-speech information, phrase or dependency structures, etc.); whether annotation is best done manually or automatically, and with which annotation tools and formats; how annotated corpora can be searched; what kind of knowledge (e.g., stochastic grammars) can be extracted, i.e., automatically learned, from them, and how the results improve on extracting (or learning) from non-annotated sources; and finally, how annotated corpora can be used to evaluate current natural language processing tools such as parsers or grammar checkers. The papers deal with a variety of languages, including Chinese, Czech, English, French, German, Italian, Japanese, Polish, Portuguese, Spanish, and Turkish. The articles of this volume will not be indexed individually.
text corpus
corpus annotation
annotation tool
annotation language
part-of-speech tagging
treebanks
parsed corpora