
How to get only the elements that have text values with StAX

  •  1
  • ant  · Tech community  · 14 years ago

    I'm trying to get only the elements that contain text. For example, given this XML:

    <root>
          <Item>
            <ItemID>4504216603</ItemID>
            <ListingDetails>
              <StartTime>10:00:10.000Z</StartTime>
              <EndTime>10:00:30.000Z</EndTime>
              <ViewItemURL>http://url</ViewItemURL>
                ....
          </Item>
    

    It should print:

    Element Local Name:ItemID
    Text:4504216603
    Element Local Name:StartTime
    Text:10:00:10.000Z
    Element Local Name:EndTime
    Text:10:00:30.000Z
    Element Local Name:ViewItemURL
    Text:http://url
    

    XMLInputFactory inputFactory = XMLInputFactory.newInstance();
    InputStream input = new FileInputStream(new File("src/main/resources/file.xml"));
    XMLStreamReader xmlStreamReader = inputFactory.createXMLStreamReader(input);
    
    while (xmlStreamReader.hasNext()) {
        int event = xmlStreamReader.next();
    
        if (event == XMLStreamConstants.START_ELEMENT) {
            System.out.println("Element Local Name:" + xmlStreamReader.getLocalName());
        }
    
        if (event == XMLStreamConstants.CHARACTERS) {
            if (!xmlStreamReader.getText().trim().equals("")) {
                System.out.println("Text:" + xmlStreamReader.getText().trim());
            }
        }
    }
    

    Edit: the incorrect behavior:

        Element Local Name:root
        Element Local Name:Item
        Element Local Name:ItemID
        Text:4504216603
        Element Local Name:ListingDetails
        Element Local Name:StartTime
        Text:10:00:10.000Z
        Element Local Name:EndTime
        Text:10:00:30.000Z
        Element Local Name:ViewItemURL
        Text:http://url
    

    I don't want root and the other nodes that have no text to be printed, only the output I wrote above. Thank you.

    2 answers  |  as of 14 years ago
        1
  •  2
  •   Georgy Bolyuba    14 years ago

    Try this:

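    while (xmlStreamReader.hasNext()) {
        int event = xmlStreamReader.next();
    
        if (event == XMLStreamConstants.START_ELEMENT) {
            // Read the name first: getElementText() leaves the reader on the END_ELEMENT.
            String name = xmlStreamReader.getLocalName();
            try {
                String text = xmlStreamReader.getElementText();
                System.out.println("Element Local Name:" + name);
                System.out.println("Text:" + text);
            } catch (XMLStreamException e) {
                // Thrown when the element contains child elements rather than only text;
                // such elements are skipped on purpose. The reader has by then already
                // advanced onto the child start tag, so that child may be missed.
            }
        }
    }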
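
A possible alternative that stays with StAX but avoids `getElementText()`: buffer the text after each start tag and decide at the end tag whether the element was a text-only leaf. This is only a minimal sketch. The class name `LeafTextPrinter`, the method `collectLeafText`, and the `name=text` result format are assumptions made for illustration, not part of the original code.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class LeafTextPrinter {

    // Hypothetical helper: returns "name=text" for every element that has
    // non-blank text and no child elements.
    public static List<String> collectLeafText(Reader xml) {
        List<String> result = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        // True while we are inside an element for which no child start tag has been seen.
        boolean leafCandidate = false;
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        // A new start tag discards any text buffered for the parent.
                        text.setLength(0);
                        leafCandidate = true;
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                        if (leafCandidate) {
                            text.append(reader.getText());
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        String value = text.toString().trim();
                        if (leafCandidate && !value.isEmpty()) {
                            result.add(reader.getLocalName() + "=" + value);
                        }
                        // After an end tag we are back inside a parent, which is not a leaf.
                        leafCandidate = false;
                        text.setLength(0);
                        break;
                    default:
                        break;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<root><Item><ItemID>4504216603</ItemID><ListingDetails>"
                + "<StartTime>10:00:10.000Z</StartTime></ListingDetails></Item></root>";
        for (String line : collectLeafText(new StringReader(xml))) {
            System.out.println(line);
        }
    }
}
```

Because the decision is deferred to the end tag, container elements such as `root` or `ListingDetails` are never reported, and no exception handling is needed to skip them.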
    

    SAX-based solution (works):

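    import java.io.File;
    import java.io.IOException;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;
    
    public class Test extends DefaultHandler {
    
        public static void main(String[] args) throws ParserConfigurationException, IOException, SAXException {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new File("src/file.xml"), new Test());
        }
    
        // Name of the most recently opened element.
        private String currentName;
    
        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
            currentName = qName;
        }
    
        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            // Note: a parser may deliver the text of one element in several chunks.
            String string = new String(ch, start, length);
            if (hasText(string)) {
                System.out.println(currentName);
                System.out.println(string);
            }
        }
    
        // True when the string contains something other than whitespace.
        private boolean hasText(String string) {
            string = string.trim();
            return string.length() > 0;
        }
    }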
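
SAX parsers are allowed to call `characters()` more than once for a single run of text, so printing inside that callback can split a value across lines. A variant that buffers the text and reports it in `endElement()` might look like the sketch below. The class name `BufferingTextHandler`, the static helper `collect`, and the `name=text` format are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class BufferingTextHandler extends DefaultHandler {

    private final List<String> lines = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    // True while inside an element for which no child start tag has been seen.
    private boolean leafCandidate;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0);
        leafCandidate = true;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Append every chunk; the complete value is only known at the end tag.
        if (leafCandidate) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (leafCandidate && !value.isEmpty()) {
            lines.add(qName + "=" + value);
        }
        leafCandidate = false;
        text.setLength(0);
    }

    // Hypothetical convenience method: parses the XML string and returns the collected lines.
    public static List<String> collect(String xml) {
        BufferingTextHandler handler = new BufferingTextHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.lines;
    }

    public static void main(String[] args) {
        String xml = "<root><Item><ItemID>4504216603</ItemID>"
                + "<ViewItemURL>http://url</ViewItemURL></Item></root>";
        collect(xml).forEach(System.out::println);
    }
}
```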
    
        2
  •  0
  •   ant    14 years ago

    StAX solution:

    Parse the document:

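    import java.io.InputStream;
    import javax.xml.stream.XMLEventReader;
    import javax.xml.stream.XMLInputFactory;
    import javax.xml.transform.Transformer;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.dom.DOMResult;
    import javax.xml.transform.stax.StAXSource;
    import javax.xml.transform.stream.StreamSource;
    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
    
    public void parseXML(InputStream xml) {
        try {
            // Read the stream with StAX and copy it into a DOM tree via an identity transform.
            DOMResult result = new DOMResult();
            XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
            XMLEventReader reader = xmlInputFactory.createXMLEventReader(new StreamSource(xml));
            TransformerFactory transFactory = TransformerFactory.newInstance();
            Transformer transformer = transFactory.newTransformer();
            transformer.transform(new StAXSource(reader), result);
            Document document = (Document) result.getNode();
    
            NodeList startlist = document.getChildNodes();
    
            processNodeList(startlist);
    
        } catch (Exception e) {
            System.err.println("Something went wrong, this might help :\n" + e.getMessage());
        }
    }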
    

    The document's top-level nodes are now in a NodeList, so next walk them recursively:

    private void processNodeList(NodeList nodelist) {
        for (int i = 0; i < nodelist.getLength(); i++) {
            Node node = nodelist.item(i);
            if (node.getNodeType() == Node.ELEMENT_NODE && (hasValidAttributes(node) || hasValidText(node))) {
                getNodeNamesAndValues(node);
            }
            // Recurse into the children of every node.
            processNodeList(node.getChildNodes());
        }
    }
    

    public void getNodeNamesAndValues(Node n) {
    
        String nodeValue = null;
        String nodeName = null;
    
        if (hasValidText(n)) {
            // hasValidText(n) already guarantees the text does not start with whitespace,
            // so this loop never runs here; it is kept as in the original.
            while (n != null && isWhiteSpace(n.getTextContent()) && StringUtils.isWhitespace(n.getTextContent()) && n.getNodeType() != Node.ELEMENT_NODE) {
                n = n.getFirstChild();
            }
    
            nodeValue = StringUtils.strip(n.getTextContent());
            nodeName = n.getLocalName();
    
            System.out.println(nodeName + " " + nodeValue);
        }
    }
    

    A few helper methods for checking nodes:

    private static boolean hasValidAttributes(Node node) {
        return node.getAttributes().getLength() > 0;
    }
    
    private boolean hasValidText(Node node) {
        String textValue = node.getTextContent();
    
        // Use isEmpty() to test the contents; != "" only compares object references.
        return textValue != null && !textValue.isEmpty() && !isWhiteSpace(textValue) && !StringUtils.isWhitespace(textValue) && node.hasChildNodes();
    }
    
    // True when the text starts with a whitespace character.
    private boolean isWhiteSpace(String nodeText) {
        return nodeText.startsWith("\r") || nodeText.startsWith("\t") || nodeText.startsWith("\n") || nodeText.startsWith(" ");
    }
    

    I also used StringUtils. If you are using Maven, you can get it by including the following in your pom.xml:

    <dependency>
        <groupId>commons-lang</groupId>
        <artifactId>commons-lang</artifactId>
        <version>2.5</version>
    </dependency>
    

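    This is inefficient if you are reading huge files, but less so if you split them first. This is what I came up with (with Google's help). There are better solutions; this is mine, and I'm an amateur (for now).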
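
For comparison, the DOM check for "element that only holds text" can also be written without looking at leading whitespace: an element qualifies when none of its children is an element and its trimmed text is non-empty. The following is a rough sketch using only the JDK. The names `DomLeafText`, `isTextOnlyElement`, `walk`, and `collect`, and the `name=text` format, are assumptions for illustration. It parses with `DocumentBuilder` rather than the StAX-to-DOM transform used above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class DomLeafText {

    // True when the node is an element with no child elements and non-blank text.
    static boolean isTextOnlyElement(Node node) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return false;
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                return false;
            }
        }
        return !node.getTextContent().trim().isEmpty();
    }

    // Depth-first walk that records "name=text" for every text-only element.
    static void walk(Node node, List<String> out) {
        if (isTextOnlyElement(node)) {
            out.add(node.getNodeName() + "=" + node.getTextContent().trim());
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), out);
        }
    }

    // Hypothetical entry point: parses an XML string and returns the collected lines.
    public static List<String> collect(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> out = new ArrayList<>();
            walk(document.getDocumentElement(), out);
            return out;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><Item><ItemID>4504216603</ItemID>"
                + "<ListingDetails><EndTime>10:00:30.000Z</EndTime></ListingDetails></Item></root>";
        collect(xml).forEach(System.out::println);
    }
}
```

Like the original, this loads the whole document into memory, so the same caveat about very large files applies.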