Sequential Covering Rule Building

What is Sequential Covering Rule Building?

Sequential covering is a supervised rule-learning technique that classifies data by discovering and applying one rule at a time. It learns a single rule that captures a characteristic shared by some of the training examples, assigns a class to every data point matching that rule, and removes those points from the sample. It then discovers a new rule for the remaining uncovered data. The process repeats until all of the data is covered, with each new rule contributing one more classification to the overall rule set, or “cover hypothesis.”
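The loop described above can be sketched in a few lines of Python. This is only a minimal illustration under several assumptions that are not part of the description: the training examples carry class labels, every rule is a single attribute = value test, and the "most promising" rule is the one with the highest accuracy on the remaining examples (ties broken by how many examples it covers). The function names and the toy weather data are hypothetical. Practical rule learners typically grow richer rules and stop earlier than this sketch does.

```python
from collections import Counter


def learn_one_rule(rows, labels):
    """Return the best single test (attr, value, predicted_label) on the remaining rows.

    Assumption: "best" means highest accuracy, then largest coverage.
    """
    best, best_score = None, (-1.0, 0)
    for attr in sorted(rows[0].keys()):
        for value in sorted({r[attr] for r in rows}):
            covered = [lab for r, lab in zip(rows, labels) if r[attr] == value]
            label, hits = Counter(covered).most_common(1)[0]
            score = (hits / len(covered), len(covered))
            if score > best_score:
                best, best_score = (attr, value, label), score
    return best


def sequential_covering(rows, labels):
    """Learn an ordered list of rules, removing covered examples after each one."""
    rows, labels = list(rows), list(labels)
    rules = []
    while rows:
        attr, value, label = learn_one_rule(rows, labels)
        rules.append((attr, value, label))
        # Keep only the examples the new rule does NOT cover.
        keep = [i for i, r in enumerate(rows) if r[attr] != value]
        rows = [rows[i] for i in keep]
        labels = [labels[i] for i in keep]
    return rules


def predict(rules, row, default=None):
    """Apply the rules in the order they were learned; the first match wins."""
    for attr, value, label in rules:
        if row.get(attr) == value:
            return label
    return default


# Hypothetical toy data: should we play outside?
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "overcast", "windy": "no"},
]
labels = ["yes", "yes", "no", "yes", "yes"]

rules = sequential_covering(rows, labels)
for rule in rules:
    print(rule)
```

On this toy data the first rule learned is windy = no, which covers three examples. The later rules are learned only from the two examples that remain, which is why the rules have to be applied in the same order when predicting.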

Why use Sequential Covering?

For a model to be useful to humans, there must be some rules for how to interpret its results. One of the most common ways to obtain such rules is a decision tree, where each path from the root to a leaf represents a rule. Building the tree requires repeatedly splitting the dataset into ever smaller regions, each time on the most promising (highest-scoring) attribute at that point.

Sequential covering, by contrast, does not repeatedly split the data. Instead, it applies the most promising rule (pattern) it finds, removes all the matching examples, and then discovers and applies the next most promising rule to the remaining data, repeating until no data remains.