Exploiting data-distribution patterns in modeling tuple selectivities in a database (Q1803861)
From MaRDI portal
| This is the item page for this Wikibase entity, intended for internal use and editing purposes. Please use this page instead for the normal view: Exploiting data-distribution patterns in modeling tuple selectivities in a database |
scientific article; zbMATH DE number 221977
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Exploiting data-distribution patterns in modeling tuple selectivities in a database |
scientific article; zbMATH DE number 221977 |
Statements
Exploiting data-distribution patterns in modeling tuple selectivities in a database (English)
0 references
29 June 1993
0 references
The paper concerns an estimation of the number of tuples satisfying a database query. The model is based on data distribution and is independent of an used database system. Queries are considered over one relation modelled as multidimensional bit map. A partial-range query specifies a hyperrectangle within a bit map. The main idea of the paper is the following: if the 1's in the bit map are distributed not homogeneously, then the map is divided into cells in which the 1's are distributed homogeneously. Much of the paper is devoted to obtaining the best trade-off between storage requirements for bit map representation and accuracy of estimates of the number of 1's is any hyperrectangle. The authors discuss in the first four chapters a number of other approaches related to the problem. In Chapter 5, they define a new notion of homogeneity and compare it with related notions occurred in the literature. Chapter 6 contains a new method based on sampling. It is shown that it requires a sample size of 100 to produce accurate estimates 95\% of the time. The paper is written in detail but not homogeneously. All new ideas are presented rather briefly in comparing to the parts explaining other approaches etc. Any discussion concerning an implementation of the method in a database environment is not presented here.
0 references
tuple selectivity
0 references
query evaluation
0 references
data distribution
0 references
0 references