
How to extract paragraph text color from MS Word using Apache POI

  •  1
  • Arun  · Tech Community  · 14 years ago

    I am using Apache POI. Is it possible to read the background and foreground colors of text in an MS Word paragraph?

    4 answers
        1
  •  5
  •   Arun    14 years ago

    I found a solution:

    import org.apache.poi.hwpf.HWPFDocument;
    import org.apache.poi.hwpf.usermodel.CharacterRun;
    import org.apache.poi.hwpf.usermodel.Paragraph;
    import org.apache.poi.hwpf.usermodel.Range;

    // fs is the InputStream (or POIFSFileSystem) of the .doc file.
    HWPFDocument doc = new HWPFDocument(fs);
    Range range = doc.getRange();
    // Iterate over the range's own paragraphs so the indices always match.
    for (int i = 0; i < range.numParagraphs(); i++) {
        Paragraph pr = range.getParagraph(i);
        System.out.println(pr.getEndOffset());
        // Bound the loop by the run count; a paragraph may contain no runs.
        for (int j = 0; j < pr.numCharacterRuns(); j++) {
            CharacterRun run = pr.getCharacterRun(j);
            System.out.println("-------------------------------");
            System.out.println("Color---" + run.getColor());
            System.out.println("getFontName---" + run.getFontName());
            System.out.println("getFontSize---" + run.getFontSize());
        }
    }
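
    The value printed by run.getColor() is a palette index, not an RGB value. A minimal sketch of turning that index into a hex string follows. It assumes the fixed 16-entry palette of the Word 97-2003 binary format; the class name IcoPalette and its method are hypothetical helpers, not part of Apache POI, and the table should be checked against the format specification before relying on it.

```java
/** Hypothetical helper: maps a Word 97-2003 palette index ("ico") to "#RRGGBB". */
final class IcoPalette {
    // Index 0 means "automatic" and has no fixed color; 1..16 are palette entries.
    private static final String[] HEX = {
        null,      // 0: automatic
        "#000000", // 1: black
        "#0000FF", // 2: blue
        "#00FFFF", // 3: cyan
        "#00FF00", // 4: green
        "#FF00FF", // 5: magenta
        "#FF0000", // 6: red
        "#FFFF00", // 7: yellow
        "#FFFFFF", // 8: white
        "#000080", // 9: dark blue
        "#008080", // 10: dark cyan
        "#008000", // 11: dark green
        "#800080", // 12: dark magenta
        "#800000", // 13: dark red
        "#808000", // 14: dark yellow
        "#808080", // 15: dark gray
        "#C0C0C0", // 16: light gray
    };

    /** Returns the hex color for the index, or "" for automatic or unknown indices. */
    static String toHex(int ico) {
        if (ico < 1 || ico >= HEX.length) {
            return "";
        }
        return HEX[ico];
    }
}
```

    With this in place, the loop could print IcoPalette.toHex(run.getColor()) instead of the raw index.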
    
        2
  •  2
  •   cyberdelia nada    12 years ago

    In this call:

    CharacterRun run = para.getCharacterRun(i)
    

    i should be an integer and should be incremented, so the code becomes:

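    // para is an org.apache.poi.hwpf.usermodel.Paragraph obtained elsewhere.
    // Bounding the loop by the run count avoids running past the last run.
    for (int c = 0; c < para.numCharacterRuns(); c++) {
        CharacterRun run = para.getCharacterRun(c);
        int x = run.getPicOffset();
        System.out.println("pic offset" + x);
    }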
    
        3
  •  0
  •   Muhammad Yousaf Sulahria    10 years ago

    if (paragraph != null)
    {
        int numberOfRuns = paragraph.NumCharacterRuns;
        for (int runIndex = 0; runIndex < numberOfRuns; runIndex++)
        {
            CharacterRun run = paragraph.GetCharacterRun(runIndex);
            string color = getColor24(run.GetIco24());
        }
    }
    

    The getColor24 function, which converts the color to hex format, in C#:

         public static String getColor24(int argbValue)
        {
            if (argbValue == -1)
                return "";
    
            int bgrValue = argbValue & 0x00FFFFFF;
            int rgbValue = (bgrValue & 0x0000FF) << 16 | (bgrValue & 0x00FF00)
                    | (bgrValue & 0xFF0000) >> 16;
    
            StringBuilder result = new StringBuilder("#");
            String hex = rgbValue.ToString("X");
            for (int i = hex.Length; i < 6; i++)
            {
                result.Append('0');
            }
            result.Append(hex);
            return result.ToString();
        }
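
    For readers using Apache POI from Java rather than C#, the same conversion can be sketched as below. It follows the byte order the function above assumes (red in the lowest byte of the 24-bit value, blue in the highest) and treats -1 as "no color". The class name Color24 is a hypothetical helper, not a POI class.

```java
/** Hypothetical helper: Java port of the getColor24 conversion shown above. */
final class Color24 {
    /** Converts a 24-bit BGR value to "#RRGGBB"; returns "" when the value is -1. */
    static String toHex(int ico24) {
        if (ico24 == -1) {
            return "";
        }
        int bgr = ico24 & 0x00FFFFFF;
        // Swap the lowest and highest bytes so the result is ordered red, green, blue.
        int rgb = ((bgr & 0x0000FF) << 16) | (bgr & 0x00FF00) | ((bgr & 0xFF0000) >> 16);
        // %06X pads with leading zeros, replacing the manual padding loop.
        return String.format("#%06X", rgb);
    }
}
```

    In a POI loop this would be called as Color24.toHex(run.getIco24()).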
    
        4
  •  0
  •   LaFei    8 years ago

    If you are working with docx (OOXML), you may want to take a look at the following:

    import java.io.*
    import org.apache.poi.xwpf.usermodel.XWPFDocument
    
    
    fun test(){
       try {
                val file = File("file.docx")
                val fis = FileInputStream(file.absolutePath)
                val document = XWPFDocument(fis)
                val paragraphs = document.paragraphs
    
                for (para in paragraphs) {
                    println("-- ("+para.alignment+") " + para.text)
    
                    para.runs.forEach { it ->
                        println(
                                "text:" + it.text() + " "
                                        + "(color:" + it.color
                                        + ",fontFamily:" + it.fontFamily
                                        + ")"
    
                        )
                    }
    
                }
    
                fis.close()
            } catch (e: Exception) {
                e.printStackTrace()
            }
    }
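
    For docx runs the color comes back as a string rather than a number. A small sketch of turning it into an integer, assuming the string is either a six-digit "RRGGBB" hex value, null when no color is set, or a keyword such as "auto"; the class name RunColor is a hypothetical helper, not part of Apache POI.

```java
/** Hypothetical helper: parses the color string of a docx run into 0xRRGGBB. */
final class RunColor {
    /** Returns 0xRRGGBB for a six-digit hex string, or -1 for null, "auto", or anything else. */
    static int parse(String color) {
        if (color == null || color.length() != 6) {
            return -1;
        }
        int value = 0;
        for (int i = 0; i < color.length(); i++) {
            int digit = Character.digit(color.charAt(i), 16);
            if (digit < 0) {
                return -1; // not a hex digit, so not an RGB value
            }
            value = (value << 4) | digit;
        }
        return value;
    }
}
```

    The red, green, and blue components can then be read with shifts, for example (rgb >> 16) & 0xFF for red.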
    