To explain, let's use the data in this table and assume that the minimum support is 2.
single item support table
Next, we take all of the items that meet the support requirements, everything so far in this example, an make all of the patterns/combinations we can out of them; AB, AC, AD, AE, BC, BD, BE, CD, CE, DE. When we list all of these combinations in a table, and determine the support for each, we get a table that looks like this.
2 items support before filtering
Several of these patterns don't meet the support threshold of 2, so we remove them from the list of options.
2 item support table
Before I create the support table for these let's look at these patterns. The first one, ABC, was created by combining AB and BC. If you look in the 2 item support table (before or after filtering), you'll find that AC doesn't have the minimum support required. If AC isn't supported, a more complicated pattern that includes AC (ABC) can't be supported either. This is a key point of the Apriori Principle. So, without having to go back to the original data, we can exclude some of the 3-item patterns. When we do this, we eliminate ABC (AC not supported), ABD (AD not supported), ABE (AE not supported), BCE (CE not supported) and BDE (DE not supported). This process of removing patterns that can't be supported because their subsets (or shorter combination) aren't supported is called pruning. This pruning process leaves only BCD with a support of 2.
3 item support table
The final list of all of the patterns that have support greater than or equal to 2 are summarized here.
- The Apriori Principle can be used to simplify the pattern generation process when mining patterns in data sets
- If a simple pattern is not supported, then a more complicated one with that simple pattern in it can not be supported (e.g. if AC isn't supported, there is no way that ABC is supported)
- You can also look at takeaway 2 in the opposite direction. If a complicated pattern meets the minimum support requirements, all of the simpler patterns that can be created from that complicated pattern must be supported. (e.g. if ABC is supported, then AB, BC, AC, A, B, and C all have to be supported)