Probabilistic counting algorithms for data base applications (Q1069325): Difference between revisions

From MaRDI portal
Import240304020342 (talk | contribs)
Set profile property.
CorrectionBot (talk | contribs)
Changed label, description and/or aliases in en, and other parts
 
(3 intermediate revisions by 3 users not shown)
description / endescription / en
scientific article
scientific article; zbMATH DE number 3934444
Property / OpenAlex ID
 
Property / OpenAlex ID: W2025051251 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5799604 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Approximate counting: a detailed analysis / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4057549 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Counting large numbers of events in small registers / rank
 
Normal rank
Property / cites work
 
Property / cites work: Sorting and Searching in Multisets / rank
 
Normal rank
Property / DBLP publication ID
 
Property / DBLP publication ID: journals/jcss/FlajoletM85 / rank
 
Normal rank

Latest revision as of 11:40, 15 July 2025

scientific article; zbMATH DE number 3934444
Language Label Description Also known as
English
Probabilistic counting algorithms for data base applications
scientific article; zbMATH DE number 3934444

    Statements

    Probabilistic counting algorithms for data base applications (English)
    0 references
    0 references
    0 references
    1985
    0 references
    This paper introduces a class of probabilistic counting algorithms with which one can estimate the number of distinct elements in a large collection of data (typically a large file stored on disk) in a single pass using only a small additional storage (typically less than a hundred binary words) and only a few operations per element scanned. The algorithms are based on statistical observations made on bits of hashed values of records. They are by construction totally insensitive to the replicative structure of elements in the file; they can be used in the context of distributed systems without any degradation of performances and prove especially useful in the context of data bases query optimisation.
    0 references
    number of distinct elements in a large collection of data
    0 references

    Identifiers