How to extract paragraph text color from MS Word using Apache POI

Time: 2010-11-15 09:30:49

Tags: apache-poi

I am using Apache POI. Is it possible to read the text background and foreground colors from an MS Word document?

4 Answers:

Answer 0: (score: 5)

I found a solution:

    // fs is an input stream opened on the .doc file
    HWPFDocument doc = new HWPFDocument(fs);
    Range range = doc.getRange();
    // Iterate over the paragraphs of the range itself; a separate WordExtractor is not needed
    for (int i = 0; i < range.numParagraphs(); i++) {
        org.apache.poi.hwpf.usermodel.Paragraph pr = range.getParagraph(i);
        System.out.println(pr.getEndOffset());

        // Bounded loop over the character runs, so an empty paragraph cannot run past the end
        for (int j = 0; j < pr.numCharacterRuns(); j++) {
            CharacterRun run = pr.getCharacterRun(j);
            System.out.println("-------------------------------");
            System.out.println("Color---" + run.getColor());
            System.out.println("getFontName---" + run.getFontName());
            System.out.println("getFontSize---" + run.getFontSize());
        }
    }
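
One thing worth noting about the snippet above: as far as I know, `CharacterRun.getColor()` in HWPF returns a small palette index (the `ico` value of the binary .doc format), not an RGB value. Below is a minimal sketch of turning that index into a hex string. The class name `IcoPalette` and the method `toHex` are made up for illustration, and the table reflects the 16 standard Word colors as I understand them, so check it against your own documents before relying on it.

```java
public class IcoPalette {
    // Assumed palette: index 0 is "auto", 1..16 are the standard Word colors.
    private static final int[] RGB = {
        -1,        // 0  auto (no explicit color)
        0x000000,  // 1  black
        0x0000FF,  // 2  blue
        0x00FFFF,  // 3  cyan
        0x00FF00,  // 4  green
        0xFF00FF,  // 5  magenta
        0xFF0000,  // 6  red
        0xFFFF00,  // 7  yellow
        0xFFFFFF,  // 8  white
        0x000080,  // 9  dark blue
        0x008080,  // 10 dark cyan
        0x008000,  // 11 dark green
        0x800080,  // 12 dark magenta
        0x800000,  // 13 dark red
        0x808000,  // 14 dark yellow
        0x808080,  // 15 dark gray
        0xC0C0C0   // 16 light gray
    };

    /** Returns "#RRGGBB", or "" for auto or an index outside the table. */
    public static String toHex(int ico) {
        if (ico <= 0 || ico >= RGB.length) {
            return "";
        }
        return String.format("#%06X", RGB[ico]);
    }

    public static void main(String[] args) {
        // In real code the argument would come from run.getColor()
        System.out.println(toHex(6));
        System.out.println(toHex(2));
    }
}
```

In the loop above you would call `IcoPalette.toHex(run.getColor())` instead of printing the raw index.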

Answer 1: (score: 2)

I found that in:

CharacterRun run = para.getCharacterRun(i);

i must be an integer and has to be incremented, so the code looks like this:

int c=0;
while (true) {
    CharacterRun run = para.getCharacterRun(c++);
    int x = run.getPicOffset();
    System.out.println("pic offset" + x);
    if (run.getEndOffset() == para.getEndOffset()) {
       break;
    }
}

Answer 2: (score: 0)

    if (paragraph != null)
    {
        int numberOfRuns = paragraph.NumCharacterRuns;
        for (int runIndex = 0; runIndex < numberOfRuns; runIndex++)
        {
            CharacterRun run = paragraph.GetCharacterRun(runIndex);
            string color = getColor24(run.GetIco24());
        }
    }

The getColor24 function, written in C#, converts the color value to hexadecimal format:

    public static String getColor24(int argbValue)
    {
        if (argbValue == -1)
            return "";

        int bgrValue = argbValue & 0x00FFFFFF;
        int rgbValue = (bgrValue & 0x0000FF) << 16 | (bgrValue & 0x00FF00)
                | (bgrValue & 0xFF0000) >> 16;

        StringBuilder result = new StringBuilder("#");
        String hex = rgbValue.ToString("X");
        for (int i = hex.Length; i < 6; i++)
        {
            result.Append('0');
        }
        result.Append(hex);
        return result.ToString();
    }
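
For anyone on the Java side, the same conversion might look like the sketch below. It assumes, as the C# code does, that the 24-bit value stores the channels in blue-green-red order (red in the lowest byte) and that -1 means no explicit color. The class name `Color24` is just for illustration.

```java
public class Color24 {
    /** Converts a 0x00BBGGRR value to "#RRGGBB"; returns "" for -1 (no explicit color). */
    public static String getColor24(int value) {
        if (value == -1) {
            return "";
        }
        int bgr = value & 0x00FFFFFF;
        // Swap the red and blue bytes, keep green where it is
        int rgb = (bgr & 0x0000FF) << 16
                | (bgr & 0x00FF00)
                | (bgr & 0xFF0000) >> 16;
        // %06X pads with leading zeros, replacing the manual padding loop
        return String.format("#%06X", rgb);
    }

    public static void main(String[] args) {
        // In real code the argument would come from run.getIco24()
        System.out.println(getColor24(0x0000FF));
        System.out.println(getColor24(-1).isEmpty());
    }
}
```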

Answer 3: (score: 0)

If you are working with docx (OOXML), you may want to take a look at this:

import java.io.File
import java.io.FileInputStream
import org.apache.poi.xwpf.usermodel.XWPFDocument


fun test() {
    try {
        // use {} closes the stream even if reading the document fails
        FileInputStream(File("file.docx")).use { fis ->
            val document = XWPFDocument(fis)

            for (para in document.paragraphs) {
                println("-- (" + para.alignment + ") " + para.text)

                // Print the text, color and font family of every run in the paragraph
                para.runs.forEach {
                    println(
                        "text:" + it.text() + " " +
                            "(color:" + it.color +
                            ",fontFamily:" + it.fontFamily +
                            ")"
                    )
                }
            }
        }
    } catch (e: Exception) {
        e.printStackTrace()
    }
}
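
In my experience the run color on the docx side comes back as a hex string such as `FF0000`, or as `null` when the run has no explicit color, and a value like `auto` may show up as well. If you need the individual channels, a minimal sketch in plain Java could look like this. `RunColor.parse` is a hypothetical helper, not part of POI.

```java
public class RunColor {
    /** Returns {r, g, b}, or null when the value is missing, "auto", or not six hex digits. */
    public static int[] parse(String color) {
        if (color == null || !color.matches("[0-9A-Fa-f]{6}")) {
            return null;
        }
        int rgb = Integer.parseInt(color, 16);
        return new int[] { (rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF };
    }

    public static void main(String[] args) {
        // In real code the argument would be the color string read from the run
        int[] c = parse("FF8000");
        System.out.println(c[0] + "," + c[1] + "," + c[2]);
        System.out.println(parse("auto") == null);
    }
}
```

Returning `null` for anything that is not a six-digit hex value keeps the caller in charge of deciding what "no color" should mean, for example falling back to the paragraph style.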