Efficient dictionary matching by Aho-Corasick automata of truncated patterns (Q2224111)
From MaRDI portal
scientific article
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Efficient dictionary matching by Aho-Corasick automata of truncated patterns |
scientific article |
Statements
Efficient dictionary matching by Aho-Corasick automata of truncated patterns (English)
0 references
3 February 2021
0 references
Summary: We present a space-efficient data structure for dictionary matching. We truncate patterns to truncated patterns where symbols are \(\ell\)-length substrings of the pattern. By employing the AC automaton of truncated patterns and that of \(\ell\)-length substrings, we simulate the AC automaton of the original pattern set. The new structure is space economical as we apply the prefix merging to substrings of patterns. Using this structure, the dictionary matching runs in \(O(n \log k + \mathrm{tocc} \log k + \mathrm{occ})\) time where \(n\) is the length of the text, \(k\) the number of patterns, occ the number of occurrences of patterns in the text, and tocc the number of occurrences of strings that are longest prefix of each pattern with length of a multiple of \(\ell\).
0 references
dictionary matching
0 references
Aho-Corasick automaton
0 references
truncated patterns
0 references