Developing multi-database mining applications (Q1049887)

In the first chapter, the authors introduce the concept of multi-database mining and present some motivating ideas and research directions. The introduction is followed by two chapters presenting approaches for synthesizing global patterns, i.e., patterns that are valid for the union of all data, from the mining results of the individual databases, the so-called local patterns. The fourth chapter investigates how to mine global rules for given items, the so-called select items. With those basic concepts explained, the remaining chapters are supposed to present various ways of increasing the quality of the derived knowledge. The authors start by presenting approaches for compressing association rules in order to store more rules in main memory. After that, the identification of similar databases is discussed in order to cluster the data sources. In the last chapter, the authors briefly highlight the critical points of designing multi-database mining applications in practice. Unfortunately, the resulting book reads more like a collection of loosely connected research papers than an integrated whole. For example, very basic concepts such as support and confidence, which were introduced in the second chapter, are introduced all over again in Chapter 6. Also, the different topics are not motivated very well. For instance, it is unclear why compressing the local patterns can increase not only the efficiency of the calculation (which is obviously true) but also the quality of knowledge, as claimed in Chapter 5. As a result, the book has no clear train of thought and no clear message for the reader. The overall appearance of the book does not convey high standards of quality either. The figures are, without exception, low-quality pixel graphics. There are frequent errors in spelling and grammar as well as frequent repetitions of phrases, so the book is not a pleasure to read.
What is worse, there is at least one factual error: the definition of support and confidence in Chapter 2 is wrong because a set intersection is used instead of a set union. The blurb claims that the authors "discuss the essential issues relating to the systematic and efficient development of multi-database mining applications, and present approaches to the development of data warehouses at different branches, demonstrating how carefully selected multi-database mining techniques contribute to successful real-world applications". In my opinion, this claim is unjustified. The authors do present a set of multi-database mining approaches but fail to put them in context, let alone show any real-world applications. Even the last chapter, which is supposed to be dedicated to exactly this point, falls short of presenting any useful content for applying the theoretical knowledge to practical problems. All in all, the book remains largely theoretical. Thus, it might be useful to readers interested in the theory of the topics presented in the individual chapters. However, practitioners dealing with actual multi-database mining scenarios will most likely not find it a useful read.
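
To make the point about support and confidence concrete, the following is a minimal sketch of the standard textbook definitions, under which the support of a rule X -> Y is the support of the itemset X ∪ Y. It does not reproduce the book's notation; the toy transactions and the function names (`support`, `rule_support`, `rule_confidence`) are assumptions chosen purely for illustration.

```python
from fractions import Fraction

# Hypothetical toy transactions, invented for illustration (not from the book).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


def rule_support(antecedent, consequent, transactions):
    """Support of the rule X -> Y: the support of the itemset X union Y."""
    return support(set(antecedent) | set(consequent), transactions)


def rule_confidence(antecedent, consequent, transactions):
    """Confidence of X -> Y: supp(X union Y) / supp(X).

    Assumes the antecedent occurs in at least one transaction;
    otherwise the division raises ZeroDivisionError.
    """
    return (rule_support(antecedent, consequent, transactions)
            / support(antecedent, transactions))


X, Y = {"bread"}, {"milk"}
print(rule_support(X, Y, TRANSACTIONS))     # 1/2
print(rule_confidence(X, Y, TRANSACTIONS))  # 2/3
# With an intersection of the itemsets, disjoint X and Y give the empty
# itemset, which every transaction contains, so the "support" is always 1.
print(support(X & Y, TRANSACTIONS))         # 1
```

With disjoint antecedent and consequent, which is the usual case for association rules, intersecting the itemsets yields the empty set and a trivial support of 1 for every rule, which is why the union is needed in the definition.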

reviewed by

Gottfried Vossen

zbMATH Keywords

data mining

multi-database mining

distributed data mining

association rules