Mining Contrast Sets

1. Introduction

  • Motivation
    • understanding differences between groups
  • Task
    • provide an efficient algorithm for mining contrast sets, together with pruning rules that reduce the search complexity
    • provide post-processing techniques that present only the surprising subsets
    • control the number of false positives
    • be statistically sound
  • Goal
    • find contrast sets whose support differs meaningfully (statistically) across groups
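      • $\exists\, i,j:\ P(\mathrm{cset} = \mathrm{true} \mid G_i) \neq P(\mathrm{cset} = \mathrm{true} \mid G_j)$
      • $\max_{i,j} \left| \mathrm{sup}(\mathrm{cset}, G_i) - \mathrm{sup}(\mathrm{cset}, G_j) \right| \geq \mathrm{min\_dev}$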
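
The minimum-deviation condition in the goal can be checked directly once per-group supports are known. The following is a minimal sketch, assuming each group is a list of records stored as dictionaries and a contrast set is a dictionary of attribute-value pairs. The helper names (`support`, `max_support_deviation`), the toy data, and the threshold value `MIN_DEV` are hypothetical, chosen only for illustration. The statistical test behind the first condition is not covered here.

```python
from itertools import combinations


def support(cset, rows):
    """Fraction of rows in which every (attribute, value) pair of cset holds."""
    if not rows:
        return 0.0
    hits = sum(all(row.get(a) == v for a, v in cset.items()) for row in rows)
    return hits / len(rows)


def max_support_deviation(cset, groups):
    """Return per-group supports and the largest pairwise support difference."""
    sups = {name: support(cset, rows) for name, rows in groups.items()}
    dev = max(
        (abs(sups[a] - sups[b]) for a, b in combinations(sups, 2)),
        default=0.0,
    )
    return sups, dev


# Toy data (made up for illustration): two groups of four records each.
groups = {
    "G1": [
        {"degree": "phd", "sex": "f"},
        {"degree": "phd", "sex": "m"},
        {"degree": "bsc", "sex": "f"},
        {"degree": "phd", "sex": "f"},
    ],
    "G2": [
        {"degree": "bsc", "sex": "f"},
        {"degree": "phd", "sex": "m"},
        {"degree": "bsc", "sex": "m"},
        {"degree": "bsc", "sex": "f"},
    ],
}

cset = {"degree": "phd"}
MIN_DEV = 0.3  # hypothetical minimum-deviation threshold

sups, dev = max_support_deviation(cset, groups)
is_large = dev >= MIN_DEV
print(sups, dev, is_large)
```

Here the support of `degree = phd` is 0.75 in `G1` and 0.25 in `G2`, so the deviation of 0.5 exceeds the assumed threshold. Whether the difference is also statistically significant would need a separate test.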
2. Naive Approach
  • Add the group label to each record as an extra attribute and run association rule mining to find the differences
  • Problems
    • the mined rules do not directly express differences between groups
    • the results are difficult to interpret
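
One possible reading of these problems can be seen in a small sketch of the naive encoding. It assumes transactions are sets of `attribute=value` strings, appends the group label as one more item, and enumerates frequent itemsets by brute force as a stand-in for a real association rule miner. The function name `frequent_itemsets`, the toy rows, and the 0.3 support threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of itemsets whose overall support meets min_support."""
    counts = Counter()
    for t in transactions:
        items = sorted(t)
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                counts[combo] += 1
    n = len(transactions)
    return {iset: c / n for iset, c in counts.items() if c / n >= min_support}


# Toy data (made up for illustration): three records per group.
rows = [
    ({"degree=phd"}, "G1"),
    ({"degree=phd"}, "G1"),
    ({"degree=bsc"}, "G1"),
    ({"degree=bsc"}, "G2"),
    ({"degree=bsc"}, "G2"),
    ({"degree=phd"}, "G2"),
]

# Naive encoding: the group label becomes just another item.
transactions = [items | {f"group={g}"} for items, g in rows]

frequent = frequent_itemsets(transactions, min_support=0.3)
for itemset, sup in sorted(frequent.items()):
    print(itemset, round(sup, 3))
```

In this toy run the itemset `(degree=phd, group=G1)` is reported while its counterpart `(degree=phd, group=G2)` falls below the threshold and is dropped. Support is also measured over the whole dataset rather than within each group, so the output mixes group and non-group itemsets and has to be paired up and renormalized by hand before any between-group comparison can be made.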
