Home » Data compression, Featured, Headline

Random data compression – Risk based selection

3 October 2010 by Keshav Shetty 963 views One Comment

In my previous article I mentioned that once tree structure is ready we can fill and generate the random input along with one bit diff encoding.

As I observed as tree reaches mid levels, available options increases to 64+ elements, which will result into minimum 6bits per selection.

With respect to million random digit this approach will result into not more than 27 duplicates can be accommodated.

We need a better approach to selecting elements instead of one bit dif encoding.

I used the modified version of one bit dif encoding, I call it as risk based selection.

Lets assume that I have possible available selection (4, 9, 40, 102, 150, 165, 172, 197, 245) and actual selection was 245. Using one bit dif encoding I need 3 bit i.e “111″ to select 245 among these input.

In risk based selection I will try to make two list with maximum numbers in one list and remaining in another list based on particular bit.

If you analyze below image you will understand that 4th bit in available option results into 7 zeros, If I knew 4th bit of number which I am selecting is “1″ then I will left with only two remaining numbers.

i.e in available number 4th bit has maximum variance. (As I am aware there is other 6th bit as 1 as maximum or 7th bit as maximum zero, lets select first possibility)

The number which I am going to select i.e 245 has 4th bit as “1″, If I store this bit my next available options reduces to two i.e 150, 245.

riskbasedSelection.png

As you can see in the image red marked bit are selected for next round. I repeat same procedure for remaining options until I am left with only one option.

Green box indicates 2nd level filtering. After second bit selection I will left with 245 which is the number I am looking for. This way I need only two bit (i.e “11″) to select particular node.

This selection is more effective than bit dif encoding which requires 3 bit compared to 2 bit.

Note: In worst case scenario we may end up with 8bits for each node. You can see yourself by applying this for below example

Assume we have these available option (1,2,4,8,16,32,64,128) and number to be selected is 1. We will end up with 8 bits.

1 Star2 Stars3 Stars4 Stars5 Stars (1 votes, average: 5.00 out of 5)
Loading ... Loading ...

One Comment »

  • Khan said:

    Nice article Shetty.

Leave your response!

Add your comment below, or trackback from your own site. You can also subscribe to these comments via RSS.

Be nice. Keep it clean. Stay on topic. No spam.

You can use these tags:
<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

This is a Gravatar-enabled weblog. To get your own globally-recognized-avatar, please register at Gravatar.