Monday, August 10, 2009

[TIP] Generic Key Comparer

I'm back to code on my "Cloud" FileSystem, and distributed tools. And here is a little tip.

When you're working on Data, probably you store it as a Key-Value Pair on a BTree or something similar, and maybe this key is an aggregation of information... Maybe you've one bit of flag, N bytes of ParentKey, and others...

Now, the problem is... How can a "foreign" server sort correctly my keys? The solution is to send to the server the information on how to sort.. or a method to do it... but today, I'm focusing on the first one.



The code below show an implementation in Python of the Generic Key Comparer. At the end of the source code you can find an usage example. The Full Source Code is available here. Generic Key Comparer Source Code.



def __indexOfOne(data, tokens, offset):
  for i in xrange(offset, len(data)):
    if data[i] in tokens:
      return i
  return -1

def rawComparer(data1, data2, comparer):
  typeIds = [ 's', 'u', 'c', 'i', 'f', 'x' ]
  pyBinMap = { 
    ('u', 1): 'B', ('u', 2): 'H', ('u', 4):'L', ('u', 8):'Q',
    ('i', 1): 'b', ('i', 2): 'h', ('i', 4):'l', ('i', 8):'q',
    ('f', 4): 'f', ('f', 8): 'd'
  }

  p = i = 0
  while i < len(comparer):
  nextIdx = __indexOfOne(comparer, typeIds, i + 1)
  if (nextIdx < 0): nextIdx = len(comparer)

    format = None
    length = 1 if (i + 1) == nextIdx else int(comparer[i+1:nextIdx])      
    if comparer[i] == 's':
      format = str(length) + 's'
    elif comparer[i] == 'c':
      format = 'c'
    elif (comparer[i], length) in pyBinMap:
      format = pyBinMap[(comparer[i], length)]         

    if format != None:
      d1 = struct.unpack(format, data1[p:p+length])[0]
      d2 = struct.unpack(format, data2[p:p+length])[0]

     if d1 < d2:
       return -1
     elif d1 > d2:
       return 1 
     p += length
     i = nextIdx
     return 0

# Usage Example
if __name__ == '__main__':
  data1 = struct.pack('4sLch', 'test', 10, 'A', -3)
  data2 = struct.pack('4sLch', 'test', 10, 'A', -3)
  print 'Equal (test 10 A -3)', rawComparer(data1, data2, 's4u4ci2')

  data1 = struct.pack('4sLch', 'test', 10, 'A', 1)
  data2 = struct.pack('4sLch', 'test', 10, 'A', 0)
  print '(test 10 A 1) > (test 10 A 0)', rawComparer(data1, data2, 's4u4ci2')

  data1 = struct.pack('4sLch', 'test', 10, 'A', 0)
  data2 = struct.pack('4sLch', 'test', 10, 'A', 1)
  print '(test 10 A 0) < (test 10 A 1)', rawComparer(data1, data2, 's4u4ci2')

No comments:

Post a Comment