2014 Poster Sessions : Counting Small Cliques in Social Networks via Triangle-Preserving Decompositions

Student Name : Joshua Wang
Advisor : Tim Roughgarden
Research Areas: Theory
A clique of size r is a complete graph on r vertices, and represents the most cohesive subgroup of that size. Counting the number of small cliques is a fundamental operation in graph analysis. These numbers are an important part of graphlet analysis in bioinformatics, subgraph frequency counts in social network analysis, and are input to graph models (like exponential random graph models). The simplest non-trivial incarnation of this problem is triangle counting, with r=3. Triangle counting has a long and rich history in data mining research, but there has been little work done for larger values of r.

We provide one of the first scalable algorithms for counting r-cliques in social networks. Our algorithm is based on a recent theoretical result on triangle-preserving decompositions for social networks. We implement a novel decomposition procedure that partitions a graph into dense, triangle-rich subgraphs. We exploit these special properties by approximately counting small cliques in these subgraphs using sampling methods.

Our algorithm scales to millions of edges and approximates r-clique counts for r up to 9. It runs orders of magnitude faster than enumeration schemes. For example, for 8-clique counts, enumeration takes more than a day, while our algorithm terminates in minutes. The output is accurate, with error within 10% for almost all instances, and largely within 5%.

This is joint work with Rishi Gupta, Tim Roughgarden, and C. Seshadhri.

Joshua Wang is a first year PhD student who rotated with Tim Roughgarden during fall quarter. Prior to that, Joshua received his B.S. in computer science from Stanford.