
Converting all arrays in a PyTables/HDF5 file from float64 to float32

h5py pytables hdf5 numpy arrays


ComputerScientist · Tech Community · 8 years ago

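I have a PyTables file with a large number of subgroups. I have a way to iterate over the datatypes of all the arrays in the file. They are float64; I want to convert the file in place so that every array is float32 instead.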

import h5py
import numpy as np

# filehead is a string holding the file path
with h5py.File(filehead, 'r+') as f:
    # Lots of stuff here ... e.g. `head` is a string
    key = head + '/obsnorm/Standardizer/count'

    print("/obsnorm/Standardizer/count {}".format(f[key]))
    # `[()]` reads the whole dataset; the older `.value` attribute was removed in h5py 3.0
    print("count value: {}".format(f[key][()]))
    # Attempt an in-place cast by writing float32 values back into the dataset
    f[key][...] = f[key][()].astype('float32')
    print("/obsnorm/Standardizer/count {}".format(f[key]))
    print("count value: {}".format(f[key][()]))

Unfortunately, the printed output is:

/obsnorm/Standardizer/count <HDF5 dataset "count": shape (), type "<f8">
count value: 512364.0
/obsnorm/Standardizer/count <HDF5 dataset "count": shape (), type "<f8">
count value: 512364.0

In other words, before the assignment the type of count is f8, i.e. float64. After the cast, the type is still float64.
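A likely explanation, sketched below: an HDF5 dataset's dtype is fixed when the dataset is created, so assigning through `[...]` writes the float32 values back into the existing float64 storage instead of changing the stored type. The file name `demo_inplace.h5` and the dataset name `count` are illustrative assumptions, and the file lives only in memory.

```python
import h5py
import numpy as np

def dtype_after_inplace_cast():
    """Return the dataset dtype before and after an in-place float32 write."""
    # In-memory file (core driver, nothing written to disk); the name is hypothetical.
    with h5py.File("demo_inplace.h5", "w", driver="core", backing_store=False) as f:
        dset = f.create_dataset("count", data=np.float64(512364.0))
        before = dset.dtype
        # The values are cast to float32 here, but the write converts them
        # back to the dtype the dataset was created with.
        dset[...] = dset[()].astype("float32")
        after = dset.dtype
        return before, after

before, after = dtype_after_inplace_cast()
print(before, after)
```

Changing the stored type therefore seems to require creating a new dataset, either in a new file or under a new name in the same file.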

1 Answer


ComputerScientist · 8 years ago

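As hpaulj suggested in the comments, I decided to simply create a duplicate HDF5 file, identical to the original except that its datasets are created with type f4 (the same as float32). That let me achieve my goal.

The pseudocode is as follows: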

import h5py
import numpy as np

# Open the original file together with a new file whose name ends in `_float32`.
with h5py.File(oldfile, 'r') as f, h5py.File(newfile[:-3]+'_float32.h5', 'w') as newf:
    # `head` is some group path
    # Create groups to follow the same group structure
    newf.create_group(head)

    # When it comes time to create a dataset, make the cast here.
    # `[()]` replaces the `.value` attribute, which was removed in h5py 3.0.
    newdata = f[head+'/name_here'][()].astype('float32')
    newf.create_dataset(head+'/name_here', data=newdata, dtype='f4')

    # Proceed for all other datasets.
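The "proceed for all other datasets" step could be automated roughly as follows. This is a minimal sketch, not the exact code used above: the function name `copy_as_float32` is hypothetical, it assumes plain numeric datasets, and it reads each dataset fully into memory. Chunking, compression settings, and group attributes are not carried over.

```python
import h5py
import numpy as np

def copy_as_float32(src_path, dst_path):
    """Copy all groups and datasets, storing float64 datasets as float32."""
    with h5py.File(src_path, "r") as src, h5py.File(dst_path, "w") as dst:

        def visit(name, obj):
            if isinstance(obj, h5py.Group):
                # Mirror the group structure of the original file.
                dst.require_group(name)
            elif isinstance(obj, h5py.Dataset):
                data = obj[()]  # read the whole dataset into memory
                if obj.dtype == np.float64:
                    data = data.astype(np.float32)
                new_dset = dst.create_dataset(name, data=data)
                # Carry over the dataset's attributes.
                for attr_name, attr_value in obj.attrs.items():
                    new_dset.attrs[attr_name] = attr_value

        # visititems calls `visit` for every group and dataset below the root.
        src.visititems(visit)
```

A call such as `copy_as_float32(oldfile, oldfile[:-3] + '_float32.h5')` would then produce the converted copy, leaving the original file untouched.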