
How do I encode an XML file as XFDL (base64-gzip)?

  •  2
  • CodingWithoutComments  · Tech Community  · 16 years ago

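    Before reading anything else, please read the original thread.

    Overview: an .xfdl file is a gzipped .xml file that has then been encoded in base64. I want to decode the .xfdl into XML, modify it, and then re-encode it back into an .xfdl file.

    xfdl > xml.gz > xml > xml.gz > xfdl

    I have been able to take an .xfdl file and decode it from base64 using uudeview: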

    uudeview -i yourform.xfdl
    

    Then I decompress it with gunzip:

    gunzip -S "" < UNKNOWN.001 > yourform-unpacked.xml
    

    Then I recompress it with gzip:

    gzip yourform-unpacked.xml
    

    Then I re-encode it in base64:

    base64 -e yourform-unpacked.xml.gz yourform_reencoded.xfdl
    

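    If my thinking is correct, the original file and the re-encoded file should be equal. When I compare yourform.xfdl and yourform_reencoded.xfdl, however, they do not match. Also, the original file can be opened in the PureEdge .xfdl viewer from http://www.grants.gov/help/download_-software.jsp, but the viewer says that the re-encoded .xfdl is unreadable.

    I have also tried uuenview to re-encode in base64, and it produces the same result. Any help would be appreciated.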

    8 replies  |  until 8 years ago
        1
  •  2
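  •   John Downey    16 years ago

    As far as I know, you cannot find the compression level of an already compressed file. When you compress a file, you can specify the level with -#, where # is from 1 to 9 (1 is the fastest compression, 9 gives the smallest file). In practice you should not compare a compressed file with one that has been extracted and recompressed, because slight variations creep in easily. In your case I would compare the base64-encoded versions rather than the gzipped versions.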
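
    To make the point about compression levels concrete, here is a minimal Python sketch. The sample XML is a made-up stand-in for a real form, and `mtime=0` (available from Python 3.8) only keeps the timestamp out of the comparison. It compresses the same bytes at levels 1 and 9 and checks that both streams still decompress to the same content, even when the compressed bytes differ.

```python
import gzip

# Hypothetical stand-in for the unpacked form XML.
xml = b"<XFDL><page sid='PAGE1'>" + b"<field>value</field>" * 200 + b"</page></XFDL>"

# Same input, two compression levels; mtime=0 pins the header timestamp.
fast = gzip.compress(xml, compresslevel=1, mtime=0)
best = gzip.compress(xml, compresslevel=9, mtime=0)

# Sizes and raw bytes may differ between levels.
print(len(fast), len(best), fast == best)

# Whatever the level, both streams decompress to the same XML.
same_payload = gzip.decompress(fast) == gzip.decompress(best) == xml
print(same_payload)
```

    A byte-for-byte comparison of the compressed files therefore says little; comparing the decompressed XML is the more meaningful check.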

        2
  •  1
  •   CrazyPyro    14 years ago

    http://www.ourada.org/blog/archives/375

    http://www.ourada.org/blog/archives/390

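    They are in Python, not Ruby, but that should get you fairly close.

    The algorithm is actually for files with the header 'application/x-xfdl; content-encoding="asc-gzip"' rather than 'application/vnd.xfdl; content-encoding="base64-gzip"', but the good news is that PureEdge (aka IBM Lotus Forms) opens that format with no problem.

    To top it off, here is a base64-gzip decode (in Python) so you can make the full round trip: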

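    import base64
    import zlib

    with open(filename, 'r') as f:
      header = f.readline()
      if header == 'application/vnd.xfdl; content-encoding="base64-gzip"\n':
        decoded = b''
        for line in f:
          # assumes each base64 line's length is a multiple of 4
          decoded += base64.b64decode(line.encode("ISO-8859-1"))
        # MAX_WBITS + 16 tells zlib to expect a gzip header and trailer
        xml = zlib.decompress(decoded, zlib.MAX_WBITS + 16)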
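
    For the other half of the round trip, the following is a minimal sketch of an encoder, not something taken from the linked posts. The names `encode_xfdl` and `decode_xfdl` are hypothetical, and wrapping the base64 text at 76 columns is an assumption about what the viewer expects.

```python
import base64
import gzip

XFDL_HEADER = 'application/vnd.xfdl; content-encoding="base64-gzip"\n'

def encode_xfdl(xml_bytes: bytes, line_len: int = 76) -> str:
    """Gzip the XML, base64-encode it, and prepend the header line."""
    compressed = gzip.compress(xml_bytes)
    b64 = base64.b64encode(compressed).decode("ascii")
    # Wrap the base64 text; the 76-column width is an assumption.
    lines = [b64[i:i + line_len] for i in range(0, len(b64), line_len)]
    return XFDL_HEADER + "\n".join(lines) + "\n"

def decode_xfdl(text: str) -> bytes:
    """Inverse of encode_xfdl: check the header, then un-base64 and gunzip."""
    header, _, body = text.partition("\n")
    if header.strip() != XFDL_HEADER.strip():
        raise ValueError("not a base64-gzip XFDL file")
    # b64decode ignores the newlines between wrapped lines by default.
    return gzip.decompress(base64.b64decode(body))

if __name__ == "__main__":
    sample = b"<XFDL><field>value</field></XFDL>"
    assert decode_xfdl(encode_xfdl(sample)) == sample
```

    The re-encoded text will generally not be byte-identical to the original file, but it should decode back to the same XML.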
    
        3
  •  1
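  •   MrWizard54    14 years ago

    I did this in Java with the help of the Base64 class from http://iharder.net/base64 .

    I have been working on an application that manipulates forms in Java. I decode the file, create a DOM document from the XML, and then write it back to a file.

    My Java code for reading the file looks like this: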

    public XFDLDocument(String inputFile) 
            throws IOException, 
                ParserConfigurationException,
                SAXException
    
    {
        fileLocation = inputFile;
    
        try{
    
            //create file object
            File f = new File(inputFile);
            if(!f.exists()) {
                throw new IOException("Specified File could not be found!");
            }
    
            //open file stream from file
            FileInputStream fis = new FileInputStream(inputFile);
    
            //Skip past the MIME header
            fis.skip(FILE_HEADER_BLOCK.length());   
    
            //Decompress from base 64                   
            Base64.InputStream bis = new Base64.InputStream(fis, 
                    Base64.DECODE);
    
            //UnZIP the resulting stream
            GZIPInputStream gis = new GZIPInputStream(bis);
    
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            doc = db.parse(gis);
    
            gis.close();
            bis.close();
            fis.close();
    
        }
        catch (ParserConfigurationException pce) {
            throw new ParserConfigurationException("Error parsing XFDL from file.");
        }
        catch (SAXException saxe) {
            throw new SAXException("Error parsing XFDL into XML Document.");
        }
    }
    

    My Java code for writing the file to disk looks like this:

        /**
         * Saves the current document to the specified location
         * @param destination Desired destination for the file.
         * @param asXML True if the output should be un-encoded XML rather than Base64/GZIP
         * @throws IOException File cannot be created at specified location
         * @throws TransformerConfigurationException
         * @throws TransformerException 
         */
        public void saveFile(String destination, boolean asXML) 
            throws IOException, 
                TransformerConfigurationException, 
                TransformerException  
            {
    
            BufferedWriter bf = new BufferedWriter(new FileWriter(destination));
            bf.write(FILE_HEADER_BLOCK);
            bf.newLine();
            bf.flush();
            bf.close();
    
            OutputStream outStream;
            if(!asXML) {
                outStream = new GZIPOutputStream(
                    new Base64.OutputStream(
                            new FileOutputStream(destination, true)));
            } else {
                outStream = new FileOutputStream(destination, true);
            }
    
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.transform(new DOMSource(doc), new StreamResult(outStream));
    
            outStream.flush();
            outStream.close();      
        }
    

    Hope that helps.

        4
  •  1
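  •   Ross Bielski    13 years ago

    I have been working on something similar, and this should work for PHP. You must have a writable tmp folder, and the PHP file must be named example.php!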

        <?php
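        // Note: PHP 5.4+ ships a built-in gzdecode(); on those versions drop
        // this polyfill (or rename it) to avoid a "cannot redeclare" error.
        function gzdecode($data) {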
            $len = strlen($data);
            if ($len < 18 || strcmp(substr($data,0,2),"\x1f\x8b")) {
                echo "FILE NOT GZIP FORMAT";
                return null;  // Not GZIP format (See RFC 1952)
            }
            $method = ord(substr($data,2,1));  // Compression method
            $flags  = ord(substr($data,3,1));  // Flags
            if (($flags & 31) != $flags) {  // parentheses needed: != binds tighter than &
                // Reserved bits are set -- NOT ALLOWED by RFC 1952
                echo "RESERVED BITS ARE SET. VERY BAD";
                return null;
            }
            // NOTE: $mtime may be negative (PHP integer limitations)
            $mtime = unpack("V", substr($data,4,4));
            $mtime = $mtime[1];
            $xfl   = substr($data,8,1);
            $os    = substr($data,9,1);  // OS byte follows XFL (RFC 1952)
            $headerlen = 10;
            $extralen  = 0;
            $extra     = "";
            if ($flags & 4) {
                // 2-byte length prefixed EXTRA data in header
                if ($len - $headerlen - 2 < 8) {
                    echo "INVALID FORMAT";
                    return false;    // Invalid format
                }
                // XLEN sits right after the fixed 10-byte header
                $extralen = unpack("v",substr($data,$headerlen,2));
                $extralen = $extralen[1];
                if ($len - $headerlen - 2 - $extralen < 8) {
                    echo "INVALID FORMAT";
                    return false;    // Invalid format
                }
                $extra = substr($data,$headerlen+2,$extralen);
                $headerlen += 2 + $extralen;
            }
    
            $filenamelen = 0;
            $filename = "";
            if ($flags & 8) {
                // C-style string file NAME data in header
                if ($len - $headerlen - 1 < 8) {
                    echo "INVALID FORMAT";
                    return false;    // Invalid format
                }
                // The NUL-terminated name starts at the current header offset
                $filenamelen = strpos(substr($data,$headerlen),chr(0));
                if ($filenamelen === false || $len - $headerlen - $filenamelen - 1 < 8) {
                    echo "INVALID FORMAT";
                    return false;    // Invalid format
                }
                $filename = substr($data,$headerlen,$filenamelen);
                $headerlen += $filenamelen + 1;
            }
    
            $commentlen = 0;
            $comment = "";
            if ($flags & 16) {
                // C-style string COMMENT data in header
                if ($len - $headerlen - 1 < 8) {
                    echo "INVALID FORMAT";
                    return false;    // Invalid format
                }
                // The NUL-terminated comment starts at the current header offset
                $commentlen = strpos(substr($data,$headerlen),chr(0));
                if ($commentlen === false || $len - $headerlen - $commentlen - 1 < 8) {
                    echo "INVALID FORMAT";
                    return false;    // Invalid header format
                }
                $comment = substr($data,$headerlen,$commentlen);
                $headerlen += $commentlen + 1;
            }
    
            $headercrc = "";
            if ($flags & 1) {
                // 2-bytes (lowest order) of CRC32 on header present
                if ($len - $headerlen - 2 < 8) {
                    echo "INVALID FORMAT";
                    return false;    // Invalid format
                }
                $calccrc = crc32(substr($data,0,$headerlen)) & 0xffff;
                $headercrc = unpack("v", substr($data,$headerlen,2));
                $headercrc = $headercrc[1];
                if ($headercrc != $calccrc) {
                    echo "BAD CRC";
                    return false;    // Bad header CRC
                }
                $headerlen += 2;
            }
    
            // GZIP FOOTER - These may be negative due to PHP's integer limitations
            $datacrc = unpack("V",substr($data,-8,4));
            $datacrc = $datacrc[1];
            $isize = unpack("V",substr($data,-4));
            $isize = $isize[1];
    
            // Perform the decompression:
            $bodylen = $len-$headerlen-8;
            if ($bodylen < 1) {
                // This should never happen - IMPLEMENTATION BUG!
                echo "BIG OOPS";
                return null;
            }
            $body = substr($data,$headerlen,$bodylen);
            $data = "";
            if ($bodylen > 0) {
                switch ($method) {
                    case 8:
                        // Currently the only supported compression method:
                        $data = gzinflate($body);
                        break;
                    default:
                        // Unknown compression method
                        echo "UNKNOWN COMPRESSION METHOD";
                        return false;
                }
            } else {
                // I'm not sure if zero-byte body content is allowed.
                // Allow it for now...  Do nothing...
                echo "ITS EMPTY";
            }
    
            // Verify decompressed size and CRC32:
            // NOTE: This may fail with large data sizes depending on how
            //       PHP's integer limitations affect strlen() since $isize
            //       may be negative for large sizes.
            if ($isize != strlen($data) || crc32($data) != $datacrc) {
                // Bad format!  Length or CRC doesn't match!
                echo "LENGTH OR CRC DO NOT MATCH";
                return false;
            }
            return $data;
        }
        echo "<html><head></head><body>";
        if (empty($_REQUEST['upload'])) {
            echo <<<_END
        <form enctype="multipart/form-data" action="example.php" method="POST">
        <input type="hidden" name="MAX_FILE_SIZE" value="100000" />
        <table>
        <th>
        <input name="uploadedfile" type="file" />
        </th>
        <tr>
        <td><input type="submit" name="upload" value="Convert File" /></td>
        </tr>
        </table>
        </form>
        _END;
    
        }
        if (!empty($_REQUEST['upload'])) {
            // basename() strips any directory part from the client-supplied name
            $orgfile        = basename($_FILES['uploadedfile']['name']);
            $file           = "tmp/" . $orgfile;
            $name           = str_replace(".xfdl", "", $orgfile);
            $convertedfile  = "tmp/" . $name . ".xml";
            $compressedfile = "tmp/" . $name . ".gz";
            $finalfile      = "tmp/" . $name . "new.xfdl";
            $target_path    = "tmp/" . $orgfile;
            if (!move_uploaded_file($_FILES['uploadedfile']['tmp_name'], $target_path)) {
                echo "There was an error uploading the file, please try again!";
            }
            $firstline      = "application/vnd.xfdl; content-encoding=\"base64-gzip\"\n";
            $data           = file($file);
            $data           = array_slice($data, 1);
            $raw            = implode($data);
            $decoded        = base64_decode($raw);
            $decompressed   = gzdecode($decoded);
            $compressed     = gzencode($decompressed);
            $encoded        = base64_encode($compressed);
            $decoded2       = base64_decode($encoded);
            $decompressed2  = gzdecode($decoded2);
            $header         = bin2hex(substr($decoded, 0, 10));
            $tail           = bin2hex(substr($decoded, -8));
            $header2        = bin2hex(substr($compressed, 0, 10));
            $tail2          = bin2hex(substr($compressed, -8));
            $header3        = bin2hex(substr($decoded2, 0, 10));
            $tail3          = bin2hex(substr($decoded2, -8));
            $filehandle     = fopen($compressedfile, 'w');
            fwrite($filehandle, $decoded);
            fclose($filehandle);
            $filehandle     = fopen($convertedfile, 'w');
            fwrite($filehandle, $decompressed);
            fclose($filehandle);
            $filehandle     = fopen($finalfile, 'w');
            fwrite($filehandle, $firstline);
            fwrite($filehandle, $encoded);
            fclose($filehandle);
            echo "<center>";
            echo "<table style='text-align:center' >";
            echo "<tr><th>Stage 1</th>";
            echo "<th>Stage 2</th>";
            echo "<th>Stage 3</th></tr>";
            echo "<tr><td>RAW DATA -></td><td>DECODED DATA -></td><td>UNCOMPRESSED DATA -></td></tr>";
            echo "<tr><td>LENGTH: ".strlen($raw)."</td>";
            echo "<td>LENGTH: ".strlen($decoded)."</td>";
            echo "<td>LENGTH: ".strlen($decompressed)."</td></tr>";
            echo "<tr><td><a href='tmp/".$orgfile."'/>ORIGINAL</a></td><td>GZIP HEADER:".$header."</td><td><a href='".$convertedfile."'/>XML CONVERTED</a></td></tr>";
            echo "<tr><td></td><td>GZIP TAIL:".$tail."</td><td></td></tr>";
            echo "<tr><td><textarea cols='30' rows='20'>" . $raw . "</textarea></td>";
            echo "<td><textarea cols='30' rows='20'>" . $decoded . "</textarea></td>";
            echo "<td><textarea cols='30' rows='20'>" . $decompressed . "</textarea></td></tr>";
            echo "<tr><th>Stage 6</th>";
            echo "<th>Stage 5</th>";
            echo "<th>Stage 4</th></tr>";
            echo "<tr><td>ENCODED DATA <-</td><td>COMPRESSED DATA <-</td><td>UNCOMPRESSED DATA <-</td></tr>";
            echo "<tr><td>LENGTH: ".strlen($encoded)."</td>";
            echo "<td>LENGTH: ".strlen($compressed)."</td>";
            echo "<td>LENGTH: ".strlen($decompressed)."</td></tr>";
            echo "<tr><td></td><td>GZIP HEADER:".$header2."</td><td></td></tr>";
            echo "<tr><td></td><td>GZIP TAIL:".$tail2."</td><td></td></tr>";
            echo "<tr><td><a href='".$finalfile."'/>FINAL FILE</a></td><td><a href='".$compressedfile."'/>RE-COMPRESSED FILE</a></td><td></td></tr>";
            echo "<tr><td><textarea cols='30' rows='20'>" . $encoded . "</textarea></td>";
            echo "<td><textarea cols='30' rows='20'>" . $compressed . "</textarea></td>";
            echo "<td><textarea cols='30' rows='20'>" . $decompressed  . "</textarea></td></tr>";
            echo "</table>";
            echo "</center>";
        }
        echo "</body></html>";
        ?>
    
        5
  •  1
  •   Markus Safar    9 years ago

    You need to put the following line at the beginning of the XFDL file:

    application/vnd.xfdl; content-encoding="base64-gzip"

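    Save it and try it in the viewer! If it still does not work, the changes made to the XML may have made it non-compliant in some way. In that case, after modifying the XML but before gzipping and base64-encoding it, save it with an .xfdl file extension and try opening it in the viewer. If the uncompressed, unencoded file is valid XFDL, the viewer should be able to parse and display it.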
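
    Since a form can apparently reach the viewer either as header plus base64-gzip data or as plain XML, a quick check of which one a file is can save some guessing. The helper below is a rough sketch; `classify_xfdl` is a hypothetical name, and treating anything that starts with `<` as plain XML is a simplification.

```python
XFDL_GZIP_HEADER = 'application/vnd.xfdl; content-encoding="base64-gzip"'

def classify_xfdl(text: str) -> str:
    """Return 'base64-gzip', 'plain-xml', or 'unknown' for a form's text."""
    stripped = text.lstrip()
    first_line = stripped.split("\n", 1)[0].strip()
    if first_line == XFDL_GZIP_HEADER:
        return "base64-gzip"
    # Simplification: any text starting with '<' is treated as XML.
    if stripped.startswith("<"):
        return "plain-xml"
    return "unknown"

if __name__ == "__main__":
    print(classify_xfdl(XFDL_GZIP_HEADER + "\nH4sIAAAA"))
    print(classify_xfdl("<?xml version='1.0'?><XFDL/>"))
    print(classify_xfdl("H4sIAAAA"))
```

    A file reported as 'unknown' that begins with H4sI is most likely base64-gzip data that is only missing its header line.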

        6
  •  0
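  •   John Downey    16 years ago

    Different implementations of the gzip algorithm will always produce slightly different but still correct files, and the compression level of the original file may differ from the one you are running with.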

        7
  •  0
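  •   Alex Lehmann    15 years ago

    gzip puts the file name in the header of the file, so the length of a gzipped file varies with the name of the uncompressed file.

    If gzip acts on a stream, the file name is omitted and the file is a bit shorter, so the following should work:

    gzip < yourform-unpacked.xml > yourform-unpacked.xml.gz

    Then re-encode in base64:

    base64 -e yourform-unpacked.xml.gz yourform_reencoded.xfdl

    Maybe this will produce a file of the same length.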
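
    The effect of the stored file name can be reproduced in Python. This is a small sketch using `gzip.GzipFile`, with a made-up XML payload and `mtime=0` so that only the name differs between the two streams; `gzip_bytes` is a hypothetical helper.

```python
import gzip
import io

def gzip_bytes(data: bytes, filename: str) -> bytes:
    """Gzip data in memory, storing filename in the header if non-empty."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename=filename, mode="wb", fileobj=buf, mtime=0) as gz:
        gz.write(data)
    return buf.getvalue()

xml = b"<XFDL><field>value</field></XFDL>"  # hypothetical payload
name = "yourform-unpacked.xml"

with_name = gzip_bytes(xml, name)
without_name = gzip_bytes(xml, "")

FNAME = 0x08  # flag bit in header byte 3 (RFC 1952)
print(bool(with_name[3] & FNAME), bool(without_name[3] & FNAME))

# The named stream is longer by the name plus its NUL terminator.
print(len(with_name) - len(without_name), len(name) + 1)
```

    Both streams decompress to the same bytes; only the header differs.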

        8
  •  0
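  •   Markus Safar    9 years ago

    Interesting, I'll give it a try. The variations are not slight, though. The newly encoded file is longer, and when comparing the binaries before and after, the data hardly matches at all.

    Before (first three lines):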

    H4sIAAAAAAAAC+19eZOiyNb3/34K3r4RT/WEU40ssvTtrhuIuKK44Bo3YoJdFAFZ3D79C6hVVhUq
    dsnUVN/qmIkSOLlwlt/JPCfJ/PGf9dwAlorj6pb58wv0LfcFUEzJknVT+/ml2uXuCSJP3kNf/vOQ
    +TEsFVkgoDfdn18mnmd/B8HVavWt5TsKI2vKN8magyENiH3Lf9kRfpd817PmF+jpiOhQRFZcXTMV
    

    After (first three lines):

    H4sICJ/YnEgAAzEyNDQ2LTExNjk2NzUueGZkbC54bWwA7D1pU+JK19/9FV2+H5wpByEhJMRH
    uRUgCMom4DBYt2oqkAZyDQlmQZ1f/3YSNqGzKT3oDH6RdE4vOXuf08vFP88TFcygYSq6dnlM
    naWOAdQGuqxoo8vjSruRyGYzfII6/id3dPGjVKwCBK+Zl8djy5qeJ5NPT09nTduAojyCZwN9
    

    As you can see, the H4sI matches, and after that it is chaos.
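
    The divergence right after H4sI can be read straight from the gzip header. The sketch below base64-decodes the first line of each sample and unpacks the fixed 10-byte header described in RFC 1952. `gzip_header_info` is a hypothetical helper, and it assumes no FEXTRA field precedes the file name.

```python
import base64
import struct

def gzip_header_info(b64_line: str) -> dict:
    """Decode one base64 line and unpack the fixed gzip header fields."""
    raw = base64.b64decode(b64_line)
    # magic (2 bytes), method, flags, mtime (4 bytes), xfl, os: little-endian
    magic, method, flags, mtime, xfl, os_id = struct.unpack("<HBBIBB", raw[:10])
    if magic != 0x8B1F:
        raise ValueError("not a gzip stream")
    name = None
    if flags & 0x08:  # FNAME set: a NUL-terminated file name follows
        end = raw.index(b"\x00", 10)  # assumes no FEXTRA field before it
        name = raw[10:end].decode("latin-1")
    return {"flags": flags, "mtime": mtime, "os": os_id, "name": name}

original = "H4sIAAAAAAAAC+19eZOiyNb3/34K3r4RT/WEU40ssvTtrhuIuKK44Bo3YoJdFAFZ3D79C6hVVhUq"
reencoded = "H4sICJ/YnEgAAzEyNDQ2LTExNjk2NzUueGZkbC54bWwA7D1pU+JK19/9FV2+H5wpByEhJMRH"

print(gzip_header_info(original))
print(gzip_header_info(reencoded))
```

    The original stream has no flags set, a zero timestamp, and OS code 11 (NTFS). The re-encoded one has the FNAME flag, a real timestamp, OS code 3 (Unix), and the stored name 12446-1169675.xfdl.xml. Those extra header bytes shift everything after them, which is why the base64 text stops lining up almost immediately.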