Random data compression – Risk based selection
In my previous article I mentioned that once tree structure is ready we can fill and generate the random input along with one bit diff encoding.
As I observed as tree reaches mid levels, available options increases to 64+ elements, which will result into minimum 6bits per selection.
With respect to million random digit this approach will result into not more than 27 duplicates can be accommodated.
We need a better approach to selecting elements instead of one bit dif encoding.
I used the modified version of one bit dif encoding, I call it as risk based selection.
Lets assume that I have possible available selection (4, 9, 40, 102, 150, 165, 172, 197, 245) and actual selection was 245. Using one bit dif encoding I need 3 bit i.e “111″ to select 245 among these input.
In risk based selection I will try to make two list with maximum numbers in one list and remaining in another list based on particular bit.
If you analyze below image you will understand that 4th bit in available option results into 7 zeros, If I knew 4th bit of number which I am selecting is “1″ then I will left with only two remaining numbers.
i.e in available number 4th bit has maximum variance. (As I am aware there is other 6th bit as 1 as maximum or 7th bit as maximum zero, lets select first possibility)
The number which I am going to select i.e 245 has 4th bit as “1″, If I store this bit my next available options reduces to two i.e 150, 245.
As you can see in the image red marked bit are selected for next round. I repeat same procedure for remaining options until I am left with only one option.
Green box indicates 2nd level filtering. After second bit selection I will left with 245 which is the number I am looking for. This way I need only two bit (i.e “11″) to select particular node.
This selection is more effective than bit dif encoding which requires 3 bit compared to 2 bit.
Note: In worst case scenario we may end up with 8bits for each node. You can see yourself by applying this for below example
Assume we have these available option (1,2,4,8,16,32,64,128) and number to be selected is 1. We will end up with 8 bits.