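I am reading a gzip-compressed file using zlib-based file I/O in C. The compressed file I am working with is rather large, 12 GB in size. The uncompressed file is about 260 GB, so I do not plan to decompress it with gunzip and continue from there.
I am using the code below to read and write through a fixed-size buffer: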
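#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>

/* Largest window size (2^15 bytes). */
#define windowBits 15
/* Adding 32 to windowBits enables automatic zlib/gzip header detection. */
#define ENABLE_ZLIB_GZIP 32
/* Size of the input and output buffers (16 KiB). */
#define CHUNK 0x4000

/* Call a zlib function and exit if it returns a negative (error) status. */
#define CALL_ZLIB(x) do { \
    int status = (x); \
    if (status < 0) { \
        fprintf(stderr, "%s:%d: %s returned a bad status of %d.\n", \
                __FILE__, __LINE__, #x, status); \
        exit(EXIT_FAILURE); \
    } \
} while (0)
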
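/* Inflate test.gz and write the decompressed bytes to stdout. */
int main (void)
{
    const char * file_name = "test.gz";
    FILE * file;
    z_stream strm = {0};
    unsigned char in[CHUNK];
    unsigned char out[CHUNK];

    strm.zalloc = Z_NULL;
    strm.zfree = Z_NULL;
    strm.opaque = Z_NULL;
    /* The input pointer is set once here, before the read loop. */
    strm.next_in = in;
    strm.avail_in = 0;
    CALL_ZLIB (inflateInit2 (& strm, windowBits | ENABLE_ZLIB_GZIP));

    /* Open the file. */
    file = fopen (file_name, "rb");
    if (! file) {
        perror (file_name);
        inflateEnd (& strm);
        return EXIT_FAILURE;
    }
    while (1) {
        size_t bytes_read;

        /* Read the next chunk of compressed data. */
        bytes_read = fread (in, sizeof (unsigned char), sizeof (in), file);
        if (ferror (file)) {
            perror (file_name);
            inflateEnd (& strm);
            fclose (file);
            return EXIT_FAILURE;
        }
        strm.avail_in = (uInt) bytes_read;
        /* Inflate until the output buffer is no longer completely filled. */
        do {
            unsigned have;

            strm.avail_out = CHUNK;
            strm.next_out = out;
            CALL_ZLIB (inflate (& strm, Z_NO_FLUSH));
            have = CHUNK - strm.avail_out;
            fwrite (out, sizeof (unsigned char), have, stdout);
        } while (strm.avail_out == 0);
        if (feof (file)) {
            inflateEnd (& strm);
            break;
        }
    }
    fclose (file);
    return 0;
}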
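The code reads and writes the gzip data correctly for whatever buffer size is specified up front. The buffer size is fixed at some value (0x4000 in the code above).
The problem is that I cannot increase the buffer size beyond a certain value (I can use 3276008 as the buffer size, but not 32760008). To read 12 GB of compressed data, I would need a very large buffer. As I describe in my edit, this looks like some kind of DATA_ERROR and not a BUFFER error, so it is not a buffer error after all!
Is there a way to make this work using the zlib functions above?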
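A note on the size limit: in and out are local arrays, so they live on the stack. Two buffers of 3276008 bytes need about 6.5 MB, while two of 32760008 bytes need about 65 MB, which would exceed a typical default stack limit (often around 8 MB on Linux; this is an assumption about the platform, not something confirmed by the question). A minimal sketch of moving the buffers to the heap follows. The io_buffers struct and both helper functions are hypothetical names invented for illustration and are not part of zlib.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical holder for the input and output buffers. */
struct io_buffers {
    unsigned char * in;
    unsigned char * out;
    size_t size;          /* size of each buffer in bytes */
};

/* Allocate both buffers on the heap. Returns 0 on success, -1 on failure. */
static int io_buffers_alloc (struct io_buffers * b, size_t size)
{
    assert (b != NULL && size > 0);
    b->in = malloc (size);
    b->out = malloc (size);
    b->size = size;
    if (! b->in || ! b->out) {
        /* free (NULL) is a no-op, so this is safe if only one call failed. */
        free (b->in);
        free (b->out);
        b->in = b->out = NULL;
        b->size = 0;
        return -1;
    }
    return 0;
}

/* Release both buffers and reset the struct. */
static void io_buffers_free (struct io_buffers * b)
{
    free (b->in);
    free (b->out);
    b->in = b->out = NULL;
    b->size = 0;
}
```

With buffers like these, b.in and b.out would take the place of the in and out arrays, and b.size the place of CHUNK. Since avail_in and avail_out in z_stream are unsigned int, each buffer still has to fit in that range. A streaming loop processes the file chunk by chunk, so the buffer size does not need to grow with the size of the file.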
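Edit #1
The error code returned by inflate is reported by the CALL_ZLIB macro, which I apologize for not including earlier. When I run with a buffer size of 0x4000, I get the following error. I have added the CALL_ZLIB macro to the code for your reference.
Error message:
parser.c:96: inflate(&strm, Z_NO_FLUSH) returned a bad status of -3
This clearly looks like a DATA_ERROR.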
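For reference, -3 is the value zlib.h assigns to Z_DATA_ERROR, which inflate returns when the input it is given does not look like valid compressed data. The sketch below maps the numeric status codes to their zlib.h names so a message like the one above is easier to read. The two function names are hypothetical, and the numeric values are written out by hand so the snippet compiles without zlib.h.

```c
/* Map a zlib return code to the name it has in zlib.h. */
static const char * zlib_status_name (int status)
{
    switch (status) {
    case 2:  return "Z_NEED_DICT";
    case 1:  return "Z_STREAM_END";
    case 0:  return "Z_OK";
    case -1: return "Z_ERRNO";
    case -2: return "Z_STREAM_ERROR";
    case -3: return "Z_DATA_ERROR";
    case -4: return "Z_MEM_ERROR";
    case -5: return "Z_BUF_ERROR";
    case -6: return "Z_VERSION_ERROR";
    default: return "unknown";
    }
}

/* Z_BUF_ERROR (-5) only means that no progress was possible with the
   current buffers, so it is not treated as fatal here. Every other
   negative status is. */
static int zlib_status_is_fatal (int status)
{
    return status < 0 && status != -5;
}
```

A check such as zlib_status_is_fatal (status) could replace the plain status < 0 test in CALL_ZLIB, so that a Z_BUF_ERROR from a call that had nothing left to do does not abort the program.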
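Edit #2
I tried adding a [...], but this did not solve any of my problems. At first the program reads my file correctly and displays all of my data the way I want: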
0x55b0 [0x40]: event: 3
.
. ... raw event: size 64 bytes
. 0000: 03 00 00 00 00 00 40 00 18 03 00 00 18 03 00 00 ......@.........
. 0010: 4d 6f 64 65 6d 4d 61 6e 61 67 65 72 00 00 00 00 ModemManager....
. 0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0 0 0x55b0 [0x40]: PERF_RECORD_COMM: ModemManager:792/792
0x55f0 [0x40]: event: 7
.
. ... raw event: size 64 bytes
. 0000: 07 00 00 00 00 00 40 00 19 03 00 00 01 00 00 00 ......@.........
. 0010: 19 03 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
. 0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0 0 0x55f0 [0x40]: PERF_RECORD_FORK(793:793):(1:1)
0x5630 [0x40]: event: 3
.
But after a while the displayed output becomes garbled and I can no longer read it:
0x4d68 [0x38]: ........... 001 0..
0 0 00 00 00 0 00 000 00 ze 64s
. 0000: 07 00 00 00 00 00 40 00 19 03 00 00 01 00 00 00 .. 00 0 event: size 64 bytes
. 0000: 03 00 00 00 si sisizsiz4s
. 0000: 07 00 00 00 00 00 40 00 19 0....
. 0030: 00 00 00 00 00 00 00 00 00 00 00 00 ..@.@. 0010: 19 03 00 00 [0x38]: ........... 001 0..
0 0 00 00 00 0 00 000 00 ze 64s
. 0000: 07 00 00 00 00 00 40 00 100 00 00 00 00 ..............0 0 0x4d28 [0x40]: PERF_RECORD_FORK(135:135):(2:62)
0x4d68 [0x38]: ........... 001 0..
0 0 00 00 00 0 00 000 00 00 00 00: PERORD_FORK(135:135):(2:2)
This eventually ends with the error message I described in Edit #1.
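One possible explanation for output that starts out correct and then degrades: inflate advances strm.next_in as it consumes input, and the code sets strm.next_in = in only once, before the loop. After the first chunk, next_in would no longer point at the start of the refilled buffer, so later calls could read bytes that are not the data just loaded by fread. This is a guess from reading the code, not a confirmed diagnosis. The sketch below shows the bookkeeping with a hypothetical input_cursor struct and consume function standing in for z_stream and inflate; none of these names come from zlib.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the input side of z_stream. */
struct input_cursor {
    const unsigned char * next_in;  /* next byte to consume */
    size_t avail_in;                /* bytes available at next_in */
};

/* Stand-in for inflate(): consumes all available input, advancing next_in
   and decreasing avail_in, and adds every byte to *sum. */
static void consume (struct input_cursor * c, unsigned long * sum)
{
    while (c->avail_in > 0) {
        *sum += *c->next_in++;
        c->avail_in--;
    }
}

/* Feed len bytes of src through a small reusable buffer and return the
   sum of all bytes seen by consume(). */
static unsigned long sum_in_chunks (const unsigned char * src, size_t len,
                                    unsigned char * buf, size_t buf_size)
{
    struct input_cursor c = { buf, 0 };
    unsigned long sum = 0;
    size_t done = 0;

    assert (buf_size > 0);
    while (done < len) {
        size_t n = len - done < buf_size ? len - done : buf_size;

        memcpy (buf, src + done, n);  /* plays the role of fread() */
        c.next_in = buf;              /* point back at the refilled buffer */
        c.avail_in = n;
        consume (& c, & sum);
        done += n;
    }
    return sum;
}
```

In the original loop, the equivalent change would be to assign strm.next_in = in next to strm.avail_in = bytes_read after every fread. The result of sum_in_chunks does not depend on buf_size, which is the same reason a 16 KiB buffer can be enough for a 12 GB input.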