2009 Poster Sessions : Similarity Search and Locality Sensitive Hashing using Ternary Content Addressable Memories

Student Name : Rajaendra Shinde
Advisor : Ashish Goel
Research Areas: Information Systems
In this work we present a new technique of solving the (1, c)-Near Neighbor problem in Euclidean space using a Ternary Content Addressable Memory (TCAM). Existing techniques based on Locality Sensitive Hashing (LSH) suffer from a tradeoff in query time and space requirement. In particular, it is not possible to design an algorithm which has both near linear space requirement ˜O(n) and query time O(1). In this paper we introduce the concept of Ternary Locality Sensitive Hashing (TLSH). A ternary hash function hashes any point in Rd to {0, 1,*} where * is a wild card that matches both 0 and 1. By using the added functionality of a TLSH scheme with respect to the * character, we construct a TLSH family with performance substantially better than LSH. Using the TLSH family, we devise an algorithm which solves the (1, c)-Near Neighbor problem with a near linear space requirement ˜O(n), and only one TCAM access. Using this setup, it is possible to solve the c-approximate nearest neighbor problem using a single TCAM lookup.

I am a 3rd year PhD student in ICME, Stanford University. My advisor is Ashish Goel. I am currently working on similarity search in large dimensional databases using hardware assisted techniques. I have completed my undergraduation from IIT Bombay in Engineering Physics.