Combining sequence and itemset mining to discover named entities in biomedical texts: a new type of pattern (Q1046578)
scientific article; zbMATH DE number 5651379
Statements
Combining sequence and itemset mining to discover named entities in biomedical texts: a new type of pattern (English)
22 December 2009
Summary: Biomedical named entity recognition (NER) is a challenging problem. In this paper, we show that mining techniques, such as sequential pattern mining and sequential rule mining, can be useful to tackle this problem but present some limitations. We demonstrate and analyse these limitations and introduce a new kind of pattern called LSR pattern that offers an excellent trade-off between the high precision of sequential rules and the high recall of sequential patterns. We formalise the LSR pattern mining problem first. Then, we show how LSR patterns enable us to successfully tackle biomedical NER problems. We report experiments carried out on real datasets that underline the relevance of our proposition.
LSR patterns
left-sequence-right patterns
sequential patterns
biomedical NER
named entity recognition
constraint-based pattern mining
biomedical texts
sequential rule mining
gene names
protein names
text mining
information extraction