The stability of long action chains in XCS (Q1864379)

XCS represents a new form of learning classifier system that uses accuracy as a means of guiding fitness for selection within a Genetic Algorithm. The combination of accuracy-based selection and a dynamic niche-based deletion mechanism achieves a long sought-after goal: the reliable production, maintenance, and proliferation of the sub-population of optimally general accurate classifiers that map the problem domain. Wilson and Lanzi have demonstrated the applicability of XCS to the identification of the optimal action-chain leading to the optimum trade-off between reward distance and magnitude. However, Lanzi also demonstrated that XCS has difficulty in finding an optimal solution to the long action-chain environment Woods-14. Whilst these findings have shed some light on the ability of XCS to form long action-chains, they have not provided a systematic and, above all, controlled investigation of the limits of XCS learning within multiple-step environments. In this investigation a set of confounding variables in such problems is identified. These are controlled using carefully constructed FSW environments of increasing length. Whilst the investigations demonstrate that XCS is able to establish the optimal sub-population [O] when generalisation is not used, it is shown that the introduction of generalisation imposes low bounds on the length of action-chains that can be identified and chosen between to find the optimal pathway. Where these bounds are reached, a form of over-generalisation caused by the formation of dominant classifiers can occur. This form is further investigated, and the Domination Hypothesis is introduced to explain its formation and preservation.

zbMATH Keywords

learning classifier system

MaRDI profile type

Publication

full work available at URL

https://doi.org/10.1007/s005000100115
