Wednesday, August 19, 2015

the power of selective indexes

Today I was working for a client who was having a bit of a performance issue.  He has a table that looks like this:


Id Source Type Entry ParentID
1 FBI Name Charles Smith NULL
2 FBI Alias Chuck Smith 1
3 FBI Eye Color Brown 1
4 Interpol Name Richard Jones NULL
4Interpol Alias Dick Jones 4
6 Interpol Nationality UK 4


And so on, for about 12MM rows. (These are "wanted" lists -- and we get different attributes on a per-list or per-entry basis.) So what we want to do is let someone search by name, in which case we want to match any name or alias. We have a lot of lists, so we end up parallelizing the query like this:

select *
from watch_list
where source = 'FBI'
and similarity(name, 'Robert Cooper') > 0.5


But we were not getting the performance that we needed out of it that we needed. Enter "selective indexes", where we can limit the number of rows in the index, thus limiting the number of rows we have to search:

create index name_and_alias on watch_list (entry) where type in ('Name','Alias');


And boom!, we get the performance we need.

No comments: