org.xlattice.crypto.filters
Class BloomSHA1

java.lang.Object
  extended by org.xlattice.crypto.filters.BloomSHA1

public class BloomSHA1
extends java.lang.Object

A Bloom filter for sets of SHA1 digests. A Bloom filter uses a set of k hash functions to determine set membership. Each hash function produces a value in the range 0..M-1. The filter is of size M. To add a member to the set, apply each function to the new member and set the corresponding bit in the filter. For M very large relative to k, this will normally set k bits in the filter. To check whether x is a member of the set, apply each of the k hash functions to x and check whether the corresponding bits are set in the filter. If any are not set, x is definitely not a member. If all are set, x may be a member. The probability of error (the false positive rate) is f = (1 - e^(-kN/M))^k, where N is the number of set members. This class takes advantage of the fact that SHA1 digests are good- quality pseudo-random numbers. The k hash functions are the values of distinct sets of bits taken from the 20-byte SHA1 hash. The number of bits in the filter, M, is constrained to be a power of 2; M == 2^m. The number of bits in each hash function may not exceed floor(m/k). This class is designed to be thread-safe, but this has not been exhaustively tested.

Author:
< A HREF="mailto:jddixon@users.sourceforge.net">Jim Dixon BloomSHA1.java and KeySelector.java are BSD licensed from the xlattice app - http://xlattice.sourceforge.net/ minor tweaks by jrandom, exposing unsynchronized access and allowing larger M and K. changes released into the public domain.

Field Summary
protected  int[] bitOffset
           
protected  int count
           
protected  int[] filter
           
protected  int filterBits
           
protected  int filterWords
           
protected  int k
           
protected  KeySelector ks
           
protected  int m
           
protected  int[] wordOffset
           
 
Constructor Summary
BloomSHA1()
          Creates a filter of 2^20 bits with k defaulting to 8.
BloomSHA1(int m)
          Creates a filter of 2^m bits, with the number of 'hash functions" k defaulting to 8.
BloomSHA1(int m, int k)
          Creates a filter with 2^m bits and k 'hash functions', where each hash function is portion of the 160-bit SHA1 hash.
 
Method Summary
static java.lang.String btoh(byte b)
          convert single byte to String
 int capacity()
           
 void clear()
          Synchronized version
protected  void doClear()
          Clear the filter, unsynchronized
 double falsePositives()
           
 double falsePositives(int n)
           
 void insert(byte[] b)
          Add a key to the set represented by the filter.
 void insert(byte[] b, int offset, int len)
           
protected  boolean isMember(byte[] b)
          Is a key in the filter.
protected  boolean isMember(byte[] b, int offset, int len)
           
static java.lang.String itoh(int i)
          convert 32-bit integer to String
static java.lang.String keyToString(byte[] key)
           
 void locked_insert(byte[] b)
           
 void locked_insert(byte[] b, int offset, int len)
           
 boolean locked_member(byte[] b)
           
 boolean locked_member(byte[] b, int offset, int len)
           
static java.lang.String ltoh(long i)
          convert 64-bit integer to hex String
 boolean member(byte[] b)
          Is a key in the filter.
 boolean member(byte[] b, int offset, int len)
           
 int size()
          Returns the number of keys which have been inserted.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

m

protected final int m

k

protected final int k

count

protected int count

filter

protected final int[] filter

ks

protected KeySelector ks

wordOffset

protected final int[] wordOffset

bitOffset

protected final int[] bitOffset

filterBits

protected final int filterBits

filterWords

protected final int filterWords
Constructor Detail

BloomSHA1

public BloomSHA1(int m,
                 int k)
Creates a filter with 2^m bits and k 'hash functions', where each hash function is portion of the 160-bit SHA1 hash.

Parameters:
m - determines number of bits in filter
k - number of hash functionsx See KeySelector for important restriction on max m and k

BloomSHA1

public BloomSHA1(int m)
Creates a filter of 2^m bits, with the number of 'hash functions" k defaulting to 8.

Parameters:
m - determines size of filter

BloomSHA1

public BloomSHA1()
Creates a filter of 2^20 bits with k defaulting to 8.

Method Detail

doClear

protected void doClear()
Clear the filter, unsynchronized


clear

public void clear()
Synchronized version


size

public final int size()
Returns the number of keys which have been inserted. This class (BloomSHA1) does not guarantee uniqueness in any sense; if the same key is added N times, the number of set members reported will increase by N.

Returns:
number of set members

capacity

public final int capacity()
Returns:
number of bits in filter

insert

public void insert(byte[] b)
Add a key to the set represented by the filter. XXX This version does not maintain 4-bit counters, it is not a counting Bloom filter.

Parameters:
b - byte array representing a key (SHA1 digest)

insert

public void insert(byte[] b,
                   int offset,
                   int len)

locked_insert

public final void locked_insert(byte[] b)

locked_insert

public final void locked_insert(byte[] b,
                                int offset,
                                int len)

isMember

protected final boolean isMember(byte[] b)
Is a key in the filter. Sets up the bit and word offset arrays.

Parameters:
b - byte array representing a key (SHA1 digest)
Returns:
true if b is in the filter

isMember

protected final boolean isMember(byte[] b,
                                 int offset,
                                 int len)

locked_member

public final boolean locked_member(byte[] b)

locked_member

public final boolean locked_member(byte[] b,
                                   int offset,
                                   int len)

member

public final boolean member(byte[] b)
Is a key in the filter. External interface, internally synchronized.

Parameters:
b - byte array representing a key (SHA1 digest)
Returns:
true if b is in the filter

member

public final boolean member(byte[] b,
                            int offset,
                            int len)

falsePositives

public final double falsePositives(int n)
Parameters:
n - number of set members
Returns:
approximate false positive rate

falsePositives

public final double falsePositives()

keyToString

public static java.lang.String keyToString(byte[] key)

ltoh

public static java.lang.String ltoh(long i)
convert 64-bit integer to hex String


itoh

public static java.lang.String itoh(int i)
convert 32-bit integer to String


btoh

public static java.lang.String btoh(byte b)
convert single byte to String