A factor oracle is a finite-state automaton that can efficiently search for factors (substrings) in a body of text. Older techniques, such as suffix trees, were time-efficient but required significant amounts of memory. Factor oracles, by contrast, can be constructed in linear time and space in an incremental fashion.
Overview
Older techniques for matching strings include: suffix arrays, suffix trees, suffix automata or directed acyclic word graphs, and factor automata (Allauzen, Crochemore, Raffinot, 1999). In 1999, Allauzen, Crochemore, and Raffinot, presented the factor oracle algorithm as a memory efficient improvement upon these older techniques for string matching and compression. Starting in the mid-2000s, factor oracles have found application in computer music, as well.
Implementations
The Computer Audition Laboratory provides a Matlab implementation of the factor oracle algorithm.
See also
References
- Allauzen C., Crochemore M., Raffinot M., Factor oracle: a new structure for pattern matching; Proceedings of SOFSEM’99; Theory and Practice of Informatics.
- Assayag G., Dubnov S., Using Factor Oracles for Machine Improvisation. Soft Computing - A Fusion of Foundations, Methodologies and Applications. 2004-09-01. Springer Berlin / Heidelberg