
Converting all arrays in a PyTables/HDF5 file from float64 to float32

  • ComputerScientist · 8 years ago

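    I have a PyTables file with a large number of subdirectories (groups). I have a way to iterate over the datatypes of all the arrays in the file. They are float64; I want to convert the file in place.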

    Following this question, I tried overwriting a dataset by assigning the cast values back to it:

    import h5py
    import numpy as np
    
    # filehead is a string for a file
    with h5py.File(filehead, 'r+') as f:
        # Lots of stuff here ... e.g. `head` is a string
    
        dset = f[head+'/obsnorm/Standardizer/count']
        print("/obsnorm/Standardizer/count {}".format(dset))
        # dset[()] reads the value; the old `.value` attribute was removed in h5py 3.0
        print("count value: {}".format(dset[()]))
        # Try to overwrite the dataset with float32 data
        dset[...] = dset[()].astype('float32')
        print("/obsnorm/Standardizer/count {}".format(dset))
        print("count value: {}".format(dset[()]))
    

    Unfortunately, the printed output is:

    /obsnorm/Standardizer/count <HDF5 dataset "count": shape (), type "<f8">
    count value: 512364.0
    /obsnorm/Standardizer/count <HDF5 dataset "count": shape (), type "<f8">
    count value: 512364.0
    

    In other words, before the assignment the type of count is f8, i.e. float64. After the cast and assignment, the type is still float64.
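
This output follows from how HDF5 stores data: a dataset's type is fixed when the dataset is created, so slice assignment only replaces the values, which are converted back to the stored type. Below is a minimal sketch of that behavior, assuming an in-memory file with a placeholder file name (both chosen only for illustration). It also shows that deleting the dataset and recreating it under the same name does change the type.

```python
import h5py
import numpy as np

# In-memory HDF5 file (nothing is written to disk); the file name is a placeholder.
with h5py.File("dtype_demo.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("count", data=np.float64(512364.0))
    dtype_before = dset.dtype

    # Assignment only rewrites the contents; the float32 value is converted
    # back to the type the dataset was created with.
    dset[...] = dset[()].astype("float32")
    dtype_after_assign = dset.dtype

    # Deleting the link and recreating the dataset under the same name
    # is what actually changes the stored type.
    value = dset[()]
    del f["count"]
    f.create_dataset("count", data=value, dtype="f4")
    dtype_after_recreate = f["count"].dtype
    value_after_recreate = float(f["count"][()])

print(dtype_before, dtype_after_assign, dtype_after_recreate)
```

Note that deleting a dataset generally does not shrink the file on disk; reclaiming the space usually means repacking or rewriting the file.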

    1 answer
  • ComputerScientist · 8 years ago

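    As hpaulj suggested in the comments, I decided to simply create a duplicate HDF5 file, identical except that its datasets are created with type f4 (the same as float32). With that I was able to achieve my goal.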

    The pseudocode is as follows:

    import h5py
    import numpy as np
    
    # Open the original file jointly with new file, with `float32` at the end.
    with h5py.File(oldfile, 'r') as f, h5py.File(newfile[:-3]+'_float32.h5', 'w') as newf:
        # `head` is some directory structure
        # Create groups to follow the same directory structure
        newf.create_group(head)
    
        # When it comes time to create a dataset, make the cast here.
        newdata = f[head+'/name_here'][()].astype('float32')  # `.value` was removed in h5py 3.0
        newf.create_dataset(head+'/name_here', data=newdata, dtype='f4')
    
        # Proceed for all other datasets.
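
The same idea can be applied to a whole file without listing every dataset by hand. The following is only a sketch under some assumptions: the helper name `copy_as_float32`, the file names, and the `mean` and `steps` datasets are invented for the demonstration, and PyTables-specific metadata gets no special handling beyond copying attributes. It walks the source file with `visititems`, recreates each group, stores native float64 datasets as `f4`, and copies everything else unchanged.

```python
import os
import tempfile

import h5py
import numpy as np


def copy_as_float32(src_path, dst_path):
    """Copy an HDF5 file, storing every float64 dataset as float32."""
    with h5py.File(src_path, "r") as src, h5py.File(dst_path, "w") as dst:
        # visititems() does not visit the root group, so copy its attributes here.
        for key, val in src.attrs.items():
            dst.attrs[key] = val

        def visit(name, obj):
            if isinstance(obj, h5py.Group):
                target = dst.require_group(name)
            elif obj.dtype == np.float64:
                # Read the data, then store it with a float32 on-disk type.
                target = dst.create_dataset(name, data=obj[()], dtype="f4")
            else:
                # Any other object is copied unchanged, attributes included.
                src.copy(obj, dst, name=name)
                return
            for key, val in obj.attrs.items():
                target.attrs[key] = val

        src.visititems(visit)


# Usage on a small throwaway file (hypothetical paths and contents).
with tempfile.TemporaryDirectory() as tmp:
    old_path = os.path.join(tmp, "old.h5")
    new_path = os.path.join(tmp, "old_float32.h5")
    with h5py.File(old_path, "w") as f:
        f.create_dataset("obsnorm/Standardizer/count", data=np.float64(512364.0))
        f.create_dataset("obsnorm/Standardizer/mean", data=np.arange(3, dtype="f8"))
        f.create_dataset("steps", data=np.arange(4, dtype="i8"))
    copy_as_float32(old_path, new_path)
    with h5py.File(new_path, "r") as f:
        names = ("obsnorm/Standardizer/count", "obsnorm/Standardizer/mean", "steps")
        dtypes = {name: str(f[name].dtype) for name in names}

print(dtypes)
```

Reading each dataset fully with `obj[()]` keeps the sketch short but loads one dataset at a time into memory; very large arrays would need to be copied in chunks instead.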
    