Obscuring Numbers Generated by Auto Incrementing Primary Keys


When you build a website, it is often filled with objects that use serial column types, usually auto-incrementing integers. Often you want to obscure these numbers, since they may convey business information, e.g. the number of sales, users, or reviews. A better idea is to use an actual natural key, which exposes the domain name of the object rather than some numeric identifier. It's not always possible to produce a natural key for every object, and when you can't, consider obscuring the serial id.

This doesn't secure the numbers that convey business value; it only conceals them from the casual observer. Here is an alternative that uses the bit-mixing properties of exclusive or (XOR), some compression by conversion into "base36" (via reddit), and some bit shuffling so that the least significant bit is moved, which minimizes the serial appearance. You should be able to adapt this code to alternative bit sizes and shuffling patterns with some small changes. Just note that I am using signed integers, and it is important to keep the high bit 0 to avoid negative numbers, which cannot be converted via the "base36" algorithm.
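To make the XOR and base36 steps concrete before the bit shuffling comes in, here is a minimal sketch using only plain integers. The helper names `mask_only` and `unmask_only` are hypothetical and exist only for this illustration; the mask value is assumed to be the same one used in the code further down.

```python
# Assumption: the same mask value the post uses below; the high bit is 0.
XOR_MASK = 0x71234567
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"

def to36(q):
    """Convert a non-negative integer to a lowercase base36 string."""
    if q < 0:
        raise ValueError("must supply a non-negative integer")
    digits = []
    while q:
        q, r = divmod(q, 36)
        digits.append(ALPHABET[r])
    return "".join(reversed(digits)) or "0"

def mask_only(serial_id):
    """Hypothetical helper: XOR then base36, with no bit shuffling."""
    return to36(serial_id ^ XOR_MASK)

def unmask_only(token):
    """Inverse of mask_only; int(..., 36) accepts upper or lower case."""
    return int(token, 36) ^ XOR_MASK

tokens = [mask_only(n) for n in (1, 2, 3)]
# Round trip: XOR with the same mask is its own inverse.
assert [unmask_only(t) for t in tokens] == [1, 2, 3]
# Without shuffling, neighbouring ids differ only in the last character.
assert len({t[:-1] for t in tokens}) == 1
```

The last assertion is the reason for the shuffle: XOR alone leaves consecutive ids looking consecutive, because only the lowest bits change from one id to the next.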

Twiddling bits in Python isn't fun, so I used the excellent bitstring module:

    from bitstring import Bits, BitArray

    #set the mask to whatever you want, just keep the high bit 0 (or use bitstring's uint)
    XOR_MASK = Bits(int=0x71234567, length=32)

    # base36 the reddit way
    # https://github.com/reddit/reddit/blob/master/r2/r2/lib/utils/_utils.pyx
    # happens to be easy to convert back to an int using int('foo', 36)
    # int with base conversion is case insensitive
    def to_base(q, alphabet):
        if q < 0: raise ValueError("must supply a positive integer")
        l = len(alphabet)
        converted = []
        while q != 0:
            q, r = divmod(q, l)
            converted.insert(0, alphabet[r])
        return "".join(converted) or '0'

    def to36(q):
        return to_base(q, '0123456789abcdefghijklmnopqrstuvwxyz')

    def shuffle(ba, start=(1,16,8), end=(16,32,16), reverse=False):
        """
        flip some bits around
        '0x10101010' -> '0x04200808'
        """
        b = BitArray(ba)
        if reverse:
            # undo the shuffle: same ranges, applied in the opposite order
            ranges = zip(reversed(start), reversed(end))
        else:
            ranges = zip(start, end)
        for s, e in ranges:
            b.reverse(s, e)
        return b

    def encode(num):
        """
        Encodes numbers to strings

        >>> encode(1)
        've4d47'
        >>> encode(2)
        've3b6v'
        """
        return to36((shuffle(BitArray(int=num,length=32)) ^ XOR_MASK).int)

    def decode(q):
        """
        Decodes strings to numbers (case insensitive)

        >>> decode('ve3b6v')
        2
        >>> decode('Ve3b6V')
        2
        """
        return (shuffle(BitArray(int=int(q,36),length=32) ^ XOR_MASK, reverse=True)).int
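
If you want to check the round trip without installing bitstring, the same scheme can be sketched with plain integers and string slicing. This is not the code above, just an equivalent under the assumption that bit position 0 is the most significant bit, as it is in bitstring; `WIDTH`, `RANGES`, and `reverse_range` are names introduced only for this sketch.

```python
WIDTH = 32
XOR_MASK = 0x71234567  # keep the high bit 0
# (start, end) pairs, applied in this order when encoding
RANGES = ((1, 16), (16, 32), (8, 16))
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"

def to36(q):
    """Convert a non-negative integer to a lowercase base36 string."""
    digits = []
    while q:
        q, r = divmod(q, 36)
        digits.append(ALPHABET[r])
    return "".join(reversed(digits)) or "0"

def reverse_range(value, start, end, width=WIDTH):
    """Reverse the bits in [start, end), counting position 0 as the MSB."""
    bits = format(value, "0{}b".format(width))
    return int(bits[:start] + bits[start:end][::-1] + bits[end:], 2)

def shuffle(value, reverse=False):
    """Apply the range reversals; reverse=True applies them in opposite order."""
    ranges = reversed(RANGES) if reverse else RANGES
    for start, end in ranges:
        value = reverse_range(value, start, end)
    return value

def encode(num):
    if not 0 <= num < 2 ** (WIDTH - 1):
        raise ValueError("num must fit in WIDTH - 1 bits")
    return to36(shuffle(num) ^ XOR_MASK)

def decode(token):
    return shuffle(int(token, 36) ^ XOR_MASK, reverse=True)

# Every id in a sample range survives the round trip, in either case.
assert all(decode(encode(n)) == n for n in range(5000))
assert all(decode(encode(n).upper()) == n for n in range(100))
```

Adapting this to another bit size means changing `WIDTH`, the mask, and the ranges together; since position 0 is never touched by the ranges and the mask keeps it 0, the value stays non-negative.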