
How do I get the upper/lower machine words of a double according to IEEE 754 (ANSI C)?

ieee-754 double floating-point c

sascha · 15 years ago

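I want to use the sqrt implementation from fdlibm.
Depending on endianness, this implementation defines some macros for accessing the upper/lower 32 bits of a double in the following way (only the little-endian version is shown here):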

#define __HI(x) *(1+(int*)&x)
#define __LO(x) *(int*)&x
#define __HIp(x) *(1+(int*)x)
#define __LOp(x) *(int*)x

The fdlibm README says the following (slightly shortened):

Each double precision floating-point number must be in IEEE 754 
double format, and that each number can be retrieved as two 32-bit 
integers through the using of pointer bashing as in the example 
below:

Example: let y = 2.0
double fp number y:     2.0
IEEE double format: 0x4000000000000000

Referencing y as two integers:
*(int*)&y,*(1+(int*)&y) =   {0x40000000,0x0} (on sparc)
            {0x0,0x40000000} (on 386)

Note: Four macros are defined in fdlibm.h to handle this kind of
      retrieving:

__HI(x)     the high part of a double x 
        (sign,exponent,the first 21 significant bits)
__LO(x)     the least 32 significant bits of x
__HIp(x)    same as __HI except that the argument is a pointer
        to a double
__LOp(x)    same as __LO except that the argument is a pointer
        to a double

If the behavior of pointer bashing is undefined, one may hack on the 
macro in fdlibm.h.

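I want to use this implementation and these macros with the cbmc model checker, which should conform to ANSI C.
I don't know exactly what is going wrong, but the following example shows that these macros do not work (little-endian selected, 32-bit machine word selected):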

temp=24376533834232348.000000l (0100001101010101101001101001010100000100000000101101110010000111)
high=0                         (00000000000000000000000000000000)
low=67296391                   (00000100000000101101110010000111)

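Both seem to be wrong. high appears to be empty for every value of temp.

Any ideas on how to access the two 32-bit words in ANSI C?

Update: Thanks for your answers and comments. All of your suggestions work for me. For now I have decided to use R..'s version and marked it as the accepted answer, because in my tool it seems to be the most robust with respect to endianness.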

3 answers | active until 14 years ago

R.. GitHub STOP HELPING ICE 15 years ago

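Casting pointers the way you do violates the C language's aliasing rules (the compiler may assume that pointers of different types do not point to the same data, except in a few very limited cases). A better approach might be: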

#include <stdint.h>  /* uint32_t, uint64_t */
#define REP(x) (((union { double v; uint64_t r; }){ (x) }).r)
#define HI(x) ((uint32_t)(REP(x) >> 32))
#define LO(x) ((uint32_t)(REP(x)))

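Note that this also fixes the endian dependency (assuming floating-point and integer endianness are the same) and the illegal _-prefixed macro names.

An even better approach might be not to split the value into high/low parts at all and to use the uint64_t representation REP(x) directly.

From the standard's point of view, this use of unions is slightly suspect, but it is better than pointer casts. Casting to unsigned char * and accessing the data byte by byte would be better in some ways, but worse in that you would have to worry about endian considerations, and it would probably be a lot slower.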
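The byte-by-byte route mentioned above can be sketched as follows. This is only an illustration, not part of fdlibm: the helper names `is_little_endian`, `word_at`, `double_hi` and `double_lo` are hypothetical, and the sketch assumes a 64-bit IEEE 754 double, C99's `<stdint.h>`, and that doubles and integers share the same byte order.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 when the least significant byte of an
   integer is stored first (little-endian), 0 otherwise. */
static int is_little_endian(void)
{
    const uint32_t probe = 1u;
    return *(const unsigned char *)&probe == 1;
}

/* Assemble the 32-bit word that starts at byte offset `off` of the double.
   Reading through unsigned char is allowed to alias any object type. */
static uint32_t word_at(const double *d, int off)
{
    const unsigned char *p = (const unsigned char *)d + off;
    uint32_t w = 0;
    int i;
    for (i = 0; i < 4; i++) {
        /* On little-endian the first byte is the least significant one. */
        int shift = is_little_endian() ? 8 * i : 8 * (3 - i);
        w |= (uint32_t)p[i] << shift;
    }
    return w;
}

/* Sign, exponent and the top 20 stored fraction bits. */
uint32_t double_hi(double x)
{
    return word_at(&x, is_little_endian() ? 4 : 0);
}

/* The 32 least significant fraction bits. */
uint32_t double_lo(double x)
{
    return word_at(&x, is_little_endian() ? 0 : 4);
}
```

Under these assumptions, `double_hi(2.0)` yields 0x40000000 and `double_lo(2.0)` yields 0, matching the example in the fdlibm README.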

Kos 15 years ago

Why not use a union?

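union {
    double value;
    struct {
        /* Member order follows memory order: `upper` holds the high word
           only on big-endian targets; on little-endian, swap the two. */
        int upper;
        int lower;
    } words;
} converter;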

converter.value = 1.2345;
printf("%d",converter.words.upper);

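(Note that the behavior of this code is implementation-dependent: it relies on the internal representation and on specific data sizes.)

In addition, if you make the struct contain bit-fields, you can access the individual parts of the floating-point number (sign, exponent and mantissa) separately: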

union {
    double value;
    struct {
        int upper;
        int lower;
    } words;
    struct {
        long long mantissa : 52; // not 2C!
        int exponent : 11;       // not 2C!
        int sign : 1;
    };        
} converter;
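Because bit-field layout is implementation-defined, the same three parts can also be pulled out with shifts and masks on a 64-bit integer. The following is a minimal sketch under stated assumptions, not something from the answer above: `double_fields` and `split_double` are hypothetical names, `memcpy` is swapped in for the union, and the code assumes a 64-bit IEEE 754 double plus C99's `<stdint.h>`.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the three IEEE 754 binary64 fields. */
typedef struct {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 11 bits, still biased by 1023 */
    uint64_t mantissa;  /* 52 stored fraction bits, without the implicit 1 */
} double_fields;

double_fields split_double(double x)
{
    uint64_t bits;
    double_fields f;

    /* Copy the object representation; assumes sizeof(double) == 8. */
    memcpy(&bits, &x, sizeof bits);

    f.sign     = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FFu);
    f.mantissa = bits & UINT64_C(0xFFFFFFFFFFFFF);
    return f;
}
```

For 2.0 this gives sign 0, biased exponent 1024 and mantissa 0, which agrees with the bit pattern 0x4000000000000000 quoted in the question.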

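Reinderien 15 years ago

I suggest looking at the disassembly to see exactly why the existing "pointer bashing" approach does not work. Failing that, you could use something more traditional such as a binary shift (if you are on a 64-bit system).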
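A shift-based variant might look like the sketch below. It is one possible reading of the suggestion, not the answerer's code: `double_bits`, `hi_word` and `lo_word` are hypothetical names, `memcpy` is an assumption chosen here to avoid the aliasing problem, and the code assumes a 64-bit IEEE 754 double, C99's `<stdint.h>`, and identical byte order for doubles and integers.

```c
#include <stdint.h>
#include <string.h>

/* Copy the object representation of a double into a 64-bit integer.
   Assumes sizeof(double) == 8. */
static uint64_t double_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Upper 32 bits: sign, exponent and the top 20 stored fraction bits. */
uint32_t hi_word(double x)
{
    return (uint32_t)(double_bits(x) >> 32);
}

/* Lower 32 bits of the fraction. */
uint32_t lo_word(double x)
{
    return (uint32_t)(double_bits(x) & 0xFFFFFFFFu);
}
```

Since the split happens on the integer value rather than on memory addresses, the same code works on both little-endian and big-endian targets as long as the assumptions above hold.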