Various ways of using pattern matching for family characterization
A sequence belongs to the family if
1. it matches the given sequence pattern;
2. it is within a certain distance of a string that matches the pattern (the distance between strings can be defined as a number of mismatches, as an edit distance, via similarity matrices, or in some other way);
3. it matches one of a given set of patterns (i.e., it matches a union of patterns);
4. a decision tree over the matching patterns returns “yes”.
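
The four membership criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: patterns are written as regular expressions that must match the whole sequence, the distance in criterion 2 is the number of mismatches (Hamming distance) over a DNA alphabet, and the decision tree is a nested tuple. All function names and the tree encoding are hypothetical, not taken from the source.

```python
import re
from itertools import combinations, product


def matches(pattern, seq):
    """Criterion 1: the whole sequence matches the pattern (assumed to be a regex)."""
    return re.fullmatch(pattern, seq) is not None


def neighbours(seq, k, alphabet):
    """Yield every string that differs from seq in at most k positions (may repeat)."""
    yield seq
    for d in range(1, k + 1):
        for positions in combinations(range(len(seq)), d):
            for letters in product(alphabet, repeat=d):
                chars = list(seq)
                for pos, letter in zip(positions, letters):
                    chars[pos] = letter
                yield "".join(chars)


def within_mismatches(pattern, seq, k, alphabet="ACGT"):
    """Criterion 2, mismatch variant: seq is within k mismatches of a matching string.

    Brute force, so only practical for short sequences and small k.
    """
    return any(matches(pattern, s) for s in neighbours(seq, k, alphabet))


def matches_any(patterns, seq):
    """Criterion 3: seq matches at least one pattern of the set (a union of patterns)."""
    return any(matches(p, seq) for p in patterns)


def decide(tree, seq):
    """Criterion 4: walk a decision tree whose inner nodes test patterns.

    Assumed encoding: a leaf is a bool; an inner node is a tuple
    (pattern, subtree_if_match, subtree_if_no_match).
    """
    while not isinstance(tree, bool):
        pattern, if_match, if_no_match = tree
        tree = if_match if matches(pattern, seq) else if_no_match
    return tree


if __name__ == "__main__":
    tree = ("AC.*", (".*A", True, False), False)
    print(matches("AC[GT]A", "ACGA"))
    print(within_mismatches("AC[GT]A", "ACCA", 1))
    print(matches_any(["AAAA", "AC[GT]A"], "ACTA"))
    print(decide(tree, "ACGA"))
```

Criterion 2 is tested by enumerating the neighbourhood of the query sequence rather than the set of strings that match the pattern; the two views are equivalent because the mismatch distance is symmetric. An edit distance or a similarity-matrix score would need a different neighbourhood or an alignment-based test.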