Wednesday, May 28, 2014

Indexing keys and values in MapDB

MapDB is a high-performance, pure-Java database that provides concurrent collections (Maps, Sets and Queues) backed by disk storage or off-heap memory.
It also provides a mechanism for keeping collections synchronized, which can be used to build multiple indexes on a primary collection. The following example shows how to index both the keys and the values of a main collection.

1. Define a serializable class
import java.io.Serializable;

// Person must implement Serializable so that MapDB can store it
public class Person implements Serializable {
  String firstname; 
  String lastname; 
  Integer age; 
  boolean male;

  public Person(String f, String l, Integer a, boolean m) {     
    this.firstname = f;
    this.lastname = l; 
    this.age = a; 
    this.male = m;
  }

  public boolean isMale() {
    return male;
  }
  @Override
  public String toString() {
    return "Person [firstname=" + firstname + ", lastname=" + lastname + ", age=" + age + ", male=" + male + "]";
  }
}

2. Define a map of persons by id
// stores each person under a distinct id; reusing a key would overwrite the previous entry
BTreeMap<Integer, Person> primary = DBMaker.newTempTreeMap();
primary.put(111, new Person("bIs9r", "NWmqoxFf", 92, true));
primary.put(112, new Person("4KXp8", "QrPsabf1", 31, false));
primary.put(113, new Person("eJLIo", "SJwJidWk", 6, true));
primary.put(114, new Person("LGW58", "vteM4khp", 42, false));
primary.put(115, new Person("tIM8R", "Rzq75ONh", 57, false));
primary.put(116, new Person("KqKRE", "BnpUV4dW", 26, true));

3. Define a gender-based index
// stores (gender, id) tuples derived from the entries of the primary map
NavigableSet<Fun.Tuple2<Boolean, Integer>> genderIndex = new TreeSet<Fun.Tuple2<Boolean, Integer>>();

// bind the secondary set to the primary map so that it is filled with
// (gender, id) tuples and kept up to date when the primary map changes
Bind.secondaryKey(primary, genderIndex, new Fun.Function2<Boolean, Integer, Person>() {
  @Override
  public Boolean run(Integer key, Person value) {
    return Boolean.valueOf(value.isMale());
  }
});

4. Use the gender index to read all male persons
// Fun.filter returns the ids whose secondary key equals true (male)
Iterable<Integer> ids = Fun.filter(genderIndex, true);
for (Integer id : ids) {
  System.out.println(primary.get(id));
}
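
To make the lookup in step 4 more concrete, here is a minimal plain-JDK sketch of the idea behind a tuple-based secondary index: entries are (secondary key, primary key) pairs kept in sorted order, so all ids for one secondary key form a contiguous range. It does not use MapDB. The names TupleIndexSketch, entry, sampleIndex and idsFor are hypothetical, the ids 1 to 6 are made up for the illustration, and MapDB's own Fun.filter may well be implemented differently.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeSet;

// Illustration only: hypothetical names, no MapDB types involved.
class TupleIndexSketch {

  // Order entries by gender first, then by id, so equal genders sit next to each other.
  static final Comparator<Map.Entry<Boolean, Integer>> ORDER =
      Comparator.comparing((Map.Entry<Boolean, Integer> e) -> e.getKey())
          .thenComparing(e -> e.getValue());

  // One index entry: (secondary key = gender, primary key = id).
  static Map.Entry<Boolean, Integer> entry(boolean male, int id) {
    return new AbstractMap.SimpleImmutableEntry<Boolean, Integer>(male, id);
  }

  // Builds a small index for six made-up ids (1..6).
  static NavigableSet<Map.Entry<Boolean, Integer>> sampleIndex() {
    NavigableSet<Map.Entry<Boolean, Integer>> index = new TreeSet<>(ORDER);
    boolean[] male = {true, false, true, false, false, true};
    for (int i = 0; i < male.length; i++) {
      index.add(entry(male[i], i + 1));
    }
    return index;
  }

  // Range query: every entry between (male, MIN_VALUE) and (male, MAX_VALUE).
  static List<Integer> idsFor(NavigableSet<Map.Entry<Boolean, Integer>> index, boolean male) {
    List<Integer> ids = new ArrayList<>();
    for (Map.Entry<Boolean, Integer> e : index.subSet(
        entry(male, Integer.MIN_VALUE), true, entry(male, Integer.MAX_VALUE), true)) {
      ids.add(e.getValue());
    }
    return ids;
  }

  public static void main(String[] args) {
    System.out.println(idsFor(sampleIndex(), true)); // prints [1, 3, 6]
  }
}
```

Because the set is sorted, the query touches only the entries for the requested gender instead of scanning the whole primary map.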

MapDB offers several ways to define indexes on a collection, and it can also be extended with custom kinds of index. The following example implements a bitmap-style index in MapDB, mapping each secondary key to the set of primary keys that share it:
public static <K, V, K2> void secondaryKey(MapWithModificationListener<K, V> map, final Map<K2, Set<K>> secondary,
      final Fun.Function2<K2, K, V> fun) {
  // fill if empty
  if (secondary.isEmpty()) {
    for (Map.Entry<K, V> e : map.entrySet()) {
      K2 k2 = fun.run(e.getKey(), e.getValue());
      Set<K> set = secondary.get(k2);
      if (set == null) {
        set = new TreeSet<K>();
        secondary.put(k2, set);
      }
      set.add(e.getKey());
    }
  }
  // hook listener
  map.modificationListenerAdd(new MapListener<K, V>() {
    @Override
    public void update(K key, V oldVal, V newVal) {
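      if (newVal == null) {
        // removal: guard against a missing set to avoid a NullPointerException
        Set<K> set = secondary.get(fun.run(key, oldVal));
        if (set != null) {
          set.remove(key);
        }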
      } else if (oldVal == null) {
        // insert
        K2 key2 = fun.run(key, newVal);
        Set<K> set = secondary.get(key2);
        if (set == null) {
          set = new TreeSet<K>();
          secondary.put(key2, set);
        }
        set.add(key);
      } else {
        // update, must remove old key and insert new
        K2 oldKey = fun.run(key, oldVal);
        K2 newKey = fun.run(key, newVal);
        if (oldKey == newKey || (oldKey != null && oldKey.equals(newKey)))
          return; // secondary key unchanged, nothing to do
        Set<K> set1 = secondary.get(oldKey);
        if (set1 != null) {
          set1.remove(key);
        }
        Set<K> set2 = secondary.get(newKey);
        if (set2 == null) {
          set2 = new TreeSet<K>();
          secondary.put(newKey, set2);
        }
        set2.add(key);
      }
    }
  });
}
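This new index can be used as follows:
// fun is not defined in this snippet; the gender function passed to
// Bind.secondaryKey in step 3 (a Fun.Function2<Boolean, Integer, Person>) fits here
final Map<Boolean, Set<Integer>> bitmapIndex = new HashMap<Boolean, Set<Integer>>();
secondaryKey(primary, bitmapIndex, fun);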
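
To see how the listener above keeps the index in step with the primary map, the following plain-JDK sketch applies the same three rules (insert, update, removal) to a Map<Boolean, Set<Integer>> by hand. It is a simplification under stated assumptions: the value is reduced to the gender flag instead of a full Person, no MapDB types are used, and SetIndexSketch and onUpdate are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustration only: hypothetical names, no MapDB types involved.
class SetIndexSketch {

  // secondary key (gender) -> sorted set of primary keys (ids)
  final Map<Boolean, Set<Integer>> index = new HashMap<>();

  // Same convention as the listener: oldMale == null means insert,
  // newMale == null means removal, otherwise the value was replaced.
  void onUpdate(Integer id, Boolean oldMale, Boolean newMale) {
    if (oldMale != null && !oldMale.equals(newMale)) {
      // the id leaves the set of its old secondary key
      Set<Integer> ids = index.get(oldMale);
      if (ids != null) {
        ids.remove(id);
      }
    }
    if (newMale != null) {
      // the id joins the set of its new secondary key, created on demand
      index.computeIfAbsent(newMale, k -> new TreeSet<>()).add(id);
    }
  }

  public static void main(String[] args) {
    SetIndexSketch s = new SetIndexSketch();
    s.onUpdate(1, null, true);   // insert id 1 as male
    s.onUpdate(2, null, false);  // insert id 2 as female
    s.onUpdate(3, null, true);   // insert id 3 as male
    s.onUpdate(3, true, false);  // update id 3: it moves to the other set
    s.onUpdate(1, true, null);   // remove id 1
    System.out.println(s.index.get(true));   // prints []
    System.out.println(s.index.get(false));  // prints [2, 3]
  }
}
```

Reading all ids for one gender is then a single map lookup, at the cost of keeping one set per distinct secondary key.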

Continue here
