Automatic ambiguity resolution in natural language processing. An empirical approach (Q5961642)

From MaRDI portal
scientific article; zbMATH DE number 982043

    Statements

    Automatic ambiguity resolution in natural language processing. An empirical approach (English)
    26 February 1997
    This book, based on the author's PhD dissertation submitted to the Computational Linguistics Program at Carnegie Mellon University in 1995, presents an empirical approach to one of the main problems in natural language analysis: automatic ambiguity resolution. Using data from the Penn Treebank, it addresses three particularly problematic types of syntactic ambiguity in English: unknown words, lexical part-of-speech ambiguity (part-of-speech tagging), and prepositional phrase (PP) attachment ambiguity. The approach to disambiguation is based on a loglinear model, i.e., a type of statistical model that can take into account the interactions between different features and thus obtains a Bayesian posterior probability distribution over the response variable that is properly conditioned on the combinations of the explanatory variables. The major result of the study can be summarized as follows: ambiguity resolution procedures that take into account the interactions between linguistically relevant features (e.g., whether the word is capitalized, the lexical tag of the first closed-class word to the right, the part of speech of the object of the PP) obtain higher disambiguation accuracy than procedures that assume independence. This result is derived through a series of experiments that provide a rigorous empirical evaluation of the models considered and a thorough comparison with methods previously described in the literature.
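    The contrast between modelling feature interactions and assuming independence can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the book: the four-row toy data set, the generic features `a` and `b`, the function names, and the use of plain batch gradient ascent to fit a binary loglinear (logistic) model. The label depends on the combination of the two features, so a model with only separate weights for `a` and `b` cannot fit it, while a model with an added conjunction feature can.

```python
import math

# Invented toy data (not from the book): two binary features and a label
# that depends on their combination, not on either feature alone.
DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def main_effects(a, b):
    """Feature vector with a bias and one separate weight per feature."""
    return [1.0, float(a), float(b)]


def with_interaction(a, b):
    """Same vector plus a conjunction term for the a-and-b combination."""
    return [1.0, float(a), float(b), float(a * b)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit(featurize, data, lr=0.5, epochs=2000):
    """Fit binary loglinear (logistic) weights by batch gradient ascent."""
    weights = [0.0] * len(featurize(0, 0))
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for (a, b), y in data:
            x = featurize(a, b)
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            # Gradient of the log-likelihood: (observed - predicted) * feature.
            for i, xi in enumerate(x):
                grad[i] += (y - p) * xi
        weights = [w + lr * g for w, g in zip(weights, grad)]
    return weights


def accuracy(featurize, weights, data):
    """Fraction of rows whose label matches the thresholded prediction."""
    correct = 0
    for (a, b), y in data:
        p = sigmoid(sum(w * xi for w, xi in zip(weights, featurize(a, b))))
        correct += int((p > 0.5) == bool(y))
    return correct / len(data)


acc_main = accuracy(main_effects, fit(main_effects, DATA), DATA)
acc_inter = accuracy(with_interaction, fit(with_interaction, DATA), DATA)
print("main effects only:", acc_main)
print("with interaction: ", acc_inter)
```

    On this toy data no choice of separate weights for `a` and `b` classifies all four rows correctly, whereas the conjunction feature makes the data separable. The data were constructed to make that point and say nothing about the accuracies reported in the book.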
    computational linguistics
    syntactic ambiguity
    unknown words
    prepositional phrase attachment

    Identifiers