Probabilistic counting algorithms for data base applications (Q1069325)
From MaRDI portal
scientific article; zbMATH DE number 3934444
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Probabilistic counting algorithms for data base applications | scientific article; zbMATH DE number 3934444 | |
Statements
Probabilistic counting algorithms for data base applications (English)
1985
This paper introduces a class of probabilistic counting algorithms that estimate the number of distinct elements in a large collection of data (typically a large file stored on disk) in a single pass, using only a small amount of additional storage (typically fewer than a hundred binary words) and only a few operations per element scanned. The algorithms are based on statistical observations made on the bits of hashed values of records. They are by construction totally insensitive to the replicative structure of elements in the file; they can be used in distributed systems without any degradation of performance and prove especially useful for data base query optimisation.
number of distinct elements in a large collection of data