
Counting common bits in a sequence of unsigned longs

bit-manipulation c#

jason · 16 years ago

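I am looking for a faster algorithm than the one below. Given a sequence of 64-bit unsigned integers, return, for each of the 64 bit positions, the number of values in the sequence that have that bit set.

Example: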

4608 = 0000000000000000000000000000000000000000000000000001001000000000 
4097 = 0000000000000000000000000000000000000000000000000001000000000001
2048 = 0000000000000000000000000000000000000000000000000000100000000000

counts 0000000000000000000000000000000000000000000000000002101000000001

Example:

2560 = 0000000000000000000000000000000000000000000000000000101000000000
530  = 0000000000000000000000000000000000000000000000000000001000010010
512  = 0000000000000000000000000000000000000000000000000000001000000000

counts 0000000000000000000000000000000000000000000000000000103000010010

static int bits = sizeof(ulong) * 8;

public static int[] CommonBits(params ulong[] values) {
    int[] counts = new int[bits];
    int length = values.Length;

    for (int i = 0; i < length; i++) {
        ulong value = values[i];
        for (int j = 0; j < bits && value != 0; j++, value = value >> 1) {
            counts[j] += (int)(value & 1UL);
        }
    }

    return counts;
}

8 replies | up to 16 years ago

Joel 16 years ago

values.Length

Noon Silk 16 years ago

Bit Twiddling Hacks

csharptest.net 16 years ago

for (int i = 0; i < length; i++)
{
    ulong value = values[i];
    if (0ul != (value & 1ul)) counts[0]++;
    if (0ul != (value & 2ul)) counts[1]++;
    if (0ul != (value & 4ul)) counts[2]++;
    //etc...
    if (0ul != (value & 4611686018427387904ul)) counts[62]++;
    if (0ul != (value & 9223372036854775808ul)) counts[63]++;
}

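That is the best I can do... As noted in my comment, running this in a 32-bit environment will waste some time (I don't know how much). If performance matters, converting the data to uint first may help.

Tricky problem... You could even marshal this out to C++, but that depends entirely on your application. Sorry I can't be of more help; maybe someone else will see something I missed.

I tried it.

Luka Rahne 16 years ago

(Translate the bits using a lookup table)

You have to perform 8 * (number of 64-bit integers) additions.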

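#include <stdint.h>
#include <stdlib.h>

// LOOKUPTABLE is external: uint64_t[256].
// Assumed layout: byte k of LOOKUPTABLE[b] is 1 when bit k of b is set.
extern const uint64_t LOOKUPTABLE[256];

unsigned char* bitcounts(const uint64_t* int64array, int len)
{
    // 8 words x 8 byte-wide counters = 64 counters, zero-initialized
    uint64_t* array64 = (uint64_t*)calloc(8, sizeof(uint64_t));
    if (!array64) return NULL;

    for (int j = 0; j < len; j++)
    {
        uint64_t tmp = int64array[j]; // unsigned, so the shift below reaches 0
        for (int i = 7; tmp; i--)
        {
            array64[i] += LOOKUPTABLE[tmp & 0xFF]; // lowest byte lands in array64[7]
            tmp = tmp >> 8;
        }
    }
    // Caller frees the result; the order of the 64 counters depends on endianness.
    return (unsigned char*)array64;
}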

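Compared with the naive implementation, this cuts the running time by roughly a factor of 8, because it handles 8 bits at a time.

This only works for up to 255 inputs, because each count is stored in an unsigned char. If you have longer input sequences, you can change the code to use 16-bit counters (counts up to 2^16 - 1), which halves the speed.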
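The answer leaves LOOKUPTABLE undefined. Below is a minimal sketch of one way to build it, assuming (this is not stated in the answer) that entry b carries bit k of b in the low bit of byte k, so that adding entries advances eight byte-wide counters at once. The names `build_lookup_table` and `count_low_byte` are hypothetical, and only the lowest byte of each value is handled to keep the sketch short.

```c
#include <stdint.h>
#include <stddef.h>

/* Assumed layout: byte k of table[b] is 1 when bit k of b is set, else 0. */
static void build_lookup_table(uint64_t table[256])
{
    for (unsigned b = 0; b < 256; b++) {
        uint64_t spread = 0;
        for (unsigned k = 0; k < 8; k++) {
            if (b & (1u << k))
                spread |= (uint64_t)1 << (8 * k);
        }
        table[b] = spread;
    }
}

/* Counts bits 0..7 of each value; counts[k] receives the count for bit k.
   Valid for at most 255 values, since each packed counter is one byte wide. */
static void count_low_byte(const uint64_t *values, size_t len,
                           const uint64_t table[256], unsigned counts[8])
{
    uint64_t packed = 0;                 /* eight 8-bit counters in one word */
    for (size_t i = 0; i < len; i++)
        packed += table[values[i] & 0xFF];
    for (unsigned k = 0; k < 8; k++)     /* unpack by shifting, not by byte address */
        counts[k] = (unsigned)((packed >> (8 * k)) & 0xFF);
}
```

Unpacking with shifts instead of reinterpreting the word as bytes keeps the result independent of the machine's byte order.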

Jay 16 years ago

const unsigned int BYTESPERVALUE = 64 / 8;
unsigned int bcount[BYTESPERVALUE][256];
memset(bcount, 0, sizeof bcount);
for (int i = values.length; --i >= 0; )
  for (int j = BYTESPERVALUE ; --j >= 0; ) {
    const unsigned int jth_byte = (values[i] >> (j * 8)) & 0xff;
    bcount[j][jth_byte]++; // count byte value (0..255) instances
  }

unsigned int count[64];
memset(count, 0, sizeof count);
for (int i = BYTESPERVALUE; --i >= 0; )
  for (int j = 256; --j >= 0; ) // check each byte value instance
    for (int k = 8; --k >= 0; ) // for each bit in a given byte
      if (j & (1 << k)) // if bit was set, then add its count
        count[i * 8 + k] += bcount[i][j];
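The fragment above is not valid C as written (it uses `values.length`). The same two-pass idea could be sketched as a self-contained function like this; the name `count_bits_histogram` and its signature are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define BYTES_PER_VALUE 8

/* Pass 1: histogram the byte values seen at each of the 8 byte positions.
   Pass 2: for each byte value, add its tally to every bit that value has set. */
static void count_bits_histogram(const uint64_t *values, size_t len,
                                 unsigned count[64])
{
    unsigned bcount[BYTES_PER_VALUE][256];
    memset(bcount, 0, sizeof bcount);

    for (size_t i = 0; i < len; i++)
        for (int j = 0; j < BYTES_PER_VALUE; j++)
            bcount[j][(values[i] >> (j * 8)) & 0xFF]++;

    memset(count, 0, 64 * sizeof count[0]);
    for (int j = 0; j < BYTES_PER_VALUE; j++)
        for (int b = 0; b < 256; b++)
            for (int k = 0; k < 8; k++)
                if (b & (1 << k))
                    count[j * 8 + k] += bcount[j][b];
}
```

The second pass always runs 8 * 256 * 8 = 16384 inner iterations regardless of input length, so this layout is likely to pay off only for long sequences.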

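EvilTeach 15 years ago

Another approach that might be profitable is to build a 256-element array that encodes the actions to take when incrementing the counts array.

Here is an example of a 4-element table, which handles 2 bits at a time instead of 8.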

int bitToSubscript[4][3] =
{
    {0},       // No Bits set
    {1,0},     // Bit 0 set
    {1,1},     // Bit 1 set
    {2,0,1}    // Bit 0 and bit 1 set.
};

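The algorithm then boils down to:

Pick off the 2 right-hand bits of the number and use them as an index into the array above.
In that row, take the first integer. This is the number of elements in the counts array that need to be incremented.
Loop over the remaining integers in the row, incrementing the counts element at each listed subscript.
Once that loop is done, shift the original number two bits to the right... rinse and repeat as needed.

From this small example it should be possible to scale up to whatever size you want. I think another program could be used to generate the source code for the bitToSubscript array, so that it can simply be hard-coded into your program.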
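The steps above can be sketched in C as follows. This is one reading of the description rather than the author's code: the function name `count_bits_table` is hypothetical, and the subscripts in each row are treated as offsets within the current 2-bit group.

```c
#include <stdint.h>
#include <stddef.h>

/* Row layout: the first entry is how many counters to bump, the rest are
   their offsets within the current 2-bit group. */
static const int bitToSubscript[4][3] = {
    {0},        /* no bits set */
    {1, 0},     /* bit 0 set */
    {1, 1},     /* bit 1 set */
    {2, 0, 1}   /* bits 0 and 1 set */
};

/* counts must be zeroed by the caller before the first call. */
static void count_bits_table(const uint64_t *values, size_t len, int counts[64])
{
    for (size_t i = 0; i < len; i++) {
        uint64_t v = values[i];
        /* base is the bit position of the group currently in the low 2 bits */
        for (int base = 0; v != 0; base += 2, v >>= 2) {
            const int *row = bitToSubscript[v & 3];
            for (int n = 1; n <= row[0]; n++)
                counts[base + row[n]]++;
        }
    }
}
```

Scaling to 8 bits would mean a 256-row table with up to 9 entries per row, which is where a generator program becomes worthwhile.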

Ben Voigt 14 years ago

  const ulong mask = 0x1111111111111111;
  public static int[] CommonBits(params ulong[] values)
  {
    int[] counts = new int[64];

    ulong accum0 = 0, accum1 = 0, accum2 = 0, accum3 = 0;

    int i = 0;
    foreach( ulong v in values ) {
      if (i == 15) {
        for( int j = 0; j < 64; j += 4 ) {
          counts[j]   += ((int)accum0) & 15;
          counts[j+1] += ((int)accum1) & 15;
          counts[j+2] += ((int)accum2) & 15;
          counts[j+3] += ((int)accum3) & 15;
          accum0 >>= 4;
          accum1 >>= 4;
          accum2 >>= 4;
          accum3 >>= 4;
        }
        i = 0;
      }

      accum0 += (v)      & mask;
      accum1 += (v >> 1) & mask;
      accum2 += (v >> 2) & mask;
      accum3 += (v >> 3) & mask;
      i++;
    }

    for( int j = 0; j < 64; j += 4 ) {
      counts[j]   += ((int)accum0) & 15;
      counts[j+1] += ((int)accum1) & 15;
      counts[j+2] += ((int)accum2) & 15;
      counts[j+3] += ((int)accum3) & 15;
      accum0 >>= 4;
      accum1 >>= 4;
      accum2 >>= 4;
      accum3 >>= 4;
    }

    return counts;
  }

http://ideone.com/eNn4O (needs more test cases)
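One way to get more test cases is to compare the nibble-accumulator idea against a plain bit-by-bit loop. The sketch below is a C port written for that purpose, not the author's code; `count_bits_naive`, `flush_lane`, and `count_bits_nibbles` are hypothetical names.

```c
#include <stdint.h>
#include <stddef.h>

/* Reference: test every bit of every value. */
static void count_bits_naive(const uint64_t *values, size_t len, int counts[64])
{
    for (int j = 0; j < 64; j++) counts[j] = 0;
    for (size_t i = 0; i < len; i++)
        for (int j = 0; j < 64; j++)
            counts[j] += (int)((values[i] >> j) & 1);
}

/* Add an accumulator's sixteen 4-bit fields into counts; lane selects
   which bit of every group of four the accumulator tracks. */
static void flush_lane(uint64_t accum, int lane, int counts[64])
{
    for (int j = 0; j < 64; j += 4, accum >>= 4)
        counts[j + lane] += (int)(accum & 15);
}

static void count_bits_nibbles(const uint64_t *values, size_t len, int counts[64])
{
    const uint64_t mask = 0x1111111111111111ULL;
    uint64_t accum[4] = {0, 0, 0, 0};
    int pending = 0;                  /* values added since the last flush */

    for (int j = 0; j < 64; j++) counts[j] = 0;
    for (size_t i = 0; i < len; i++) {
        if (pending == 15) {          /* a 4-bit field holds at most 15 */
            for (int lane = 0; lane < 4; lane++) {
                flush_lane(accum[lane], lane, counts);
                accum[lane] = 0;
            }
            pending = 0;
        }
        for (int lane = 0; lane < 4; lane++)
            accum[lane] += (values[i] >> lane) & mask;
        pending++;
    }
    for (int lane = 0; lane < 4; lane++)
        flush_lane(accum[lane], lane, counts);
}
```

Inputs longer than 15 values, and inputs where every bit is set, exercise the flush path that a short example would miss.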


Luka Rahne 16 years ago

http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetNaive

One of them:

unsigned int v; // count the number of bits set in v
unsigned int c; // c accumulates the total bits set in v
for (c = 0; v; c++)
{
  v &= v - 1; // clear the least significant bit set
}

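Keep in mind that this loop runs once per set bit, so it takes at most about log2(n) iterations, where n is the number whose bits are being counted; for 10 (binary 1010) it needs only 2 iterations.

You should probably use the approach that counts the bits of a 32-bit value with 64-bit arithmetic and apply it to each half of the word, i.e. 2*15 + 4 instructions.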

// option 3, for at most 32-bit values in v:
c =  ((v & 0xfff) * 0x1001001001001ULL & 0x84210842108421ULL) % 0x1f;
c += (((v & 0xfff000) >> 12) * 0x1001001001001ULL & 0x84210842108421ULL) % 
   0x1f;
c += ((v >> 24) * 0x1001001001001ULL & 0x84210842108421ULL) % 0x1f;
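Applying the 32-bit routine to each half of a 64-bit word could look like the sketch below, with the clear-lowest-bit loop kept as a reference for checking. The function names are hypothetical; the three multiply, mask, and modulus lines are the ones quoted above.

```c
#include <stdint.h>

/* "Option 3" as quoted above: count the set bits of a 32-bit value in three
   12-bit (or smaller) slices using 64-bit multiply, mask, and modulus. */
static unsigned popcount32_option3(uint32_t v)
{
    unsigned c;
    c  = (unsigned)(((v & 0xfff) * 0x1001001001001ULL & 0x84210842108421ULL) % 0x1f);
    c += (unsigned)((((v & 0xfff000) >> 12) * 0x1001001001001ULL & 0x84210842108421ULL) % 0x1f);
    c += (unsigned)(((v >> 24) * 0x1001001001001ULL & 0x84210842108421ULL) % 0x1f);
    return c;
}

/* Apply the 32-bit routine to the low and high halves of a 64-bit word. */
static unsigned popcount64_halves(uint64_t v)
{
    return popcount32_option3((uint32_t)v) + popcount32_option3((uint32_t)(v >> 32));
}

/* Reference: clear the lowest set bit until none remain. */
static unsigned popcount64_loop(uint64_t v)
{
    unsigned c;
    for (c = 0; v; c++)
        v &= v - 1;
    return c;
}
```

Both functions return the total number of set bits in one word, not a count per bit position.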

If your processor supports SSE4.2, you can use the POPCNT instruction. http://en.wikipedia.org/wiki/SSE4