Available Knowhow

Abstract
Apparatus for automatic segmentation of non-aligned data sequences comprising structural domains to identify and construct models of the structural domains. The apparatus comprises a soft clustering unit, a refinement unit and an annealing unit. The soft clustering unit iteratively partitions the data sequences and trains variable memory Markov sources, created using a prediction suffix tree data structure, on the data until convergence is reached. The clustering unit also eliminates sources showing low relationships with the data. The refinement unit is connected to the soft clustering unit and splits and perturbs the sources following convergence, to repeat the iterative partitioning at the soft clustering unit, thereby to refine the model. The annealing unit increases the resolution with which the relationships between data and sources is shown, thereby governing the way in which less competitive sources are rejected, and the apparatus outputs the surviving variable memory Markov sources to provide models for subsequent identification of the structural domains.
Link to Full Patent Text: http://goo.gl/2Lhbyx