POI becomes slow when writing to an existing xlsx file after sheet.shiftRows()

Time: 2016-06-06 10:26:10

Tags: java excel apache-poi

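Hello, I am using Apache POI 3.11.

I am trying to write about 50,000 rows of data into an existing xlsx file. Performance is very slow (about 3 minutes), so I decided to use SSPerformanceTest, as described here: https://poi.apache.org/faq.html#faq-N10165

I discovered that the slowness comes from calling sheet.createRow() after sheet.shiftRows().

When I ran SSPerformanceTest out of the box, it performed very well (about 10 seconds for 50,000 rows), but after I modified it to meet my requirements, it became very slow.

I need shiftRows because the Excel file has content at the bottom, such as signatures.

Here is the code that performs well (with some debug logging of elapsed time):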

private static void addContent(Workbook workBook, boolean isHType, int rows, int cols) {
    Map<String, CellStyle> styles = createStyles(workBook);

    Sheet sheet = workBook.getSheetAt(0);


    long startShiftRow = System.currentTimeMillis();
    long endShiftRow = System.currentTimeMillis();
    System.out.println("###@@@ Elapsed shift rows " + (endShiftRow - startShiftRow) + " milli seconds");

    Cell headerCell = sheet.createRow(0).createCell(0);
    headerCell.setCellValue("Header text is spanned across multiple cells");
    headerCell.setCellStyle(styles.get("header"));
    sheet.addMergedRegion(CellRangeAddress.valueOf("$A$1:$F$1"));

    int sheetNo = 0;
    int rowIndexInSheet = 1;
    double value = 0;
    Calendar calendar = Calendar.getInstance();
    double totalCreateRowTime = 0;
    double totalSetValue = 0;
    for (int rowIndex = 0; rowIndex < rows; rowIndex++) {
        if (isHType && sheetNo != rowIndex / 0x10000) {
            sheet = workBook.createSheet("Spillover from sheet " + (++sheetNo));
            headerCell.setCellValue("Header text is spanned across multiple cells");
            headerCell.setCellStyle(styles.get("header"));
            sheet.addMergedRegion(CellRangeAddress.valueOf("$A$1:$F$1"));
            rowIndexInSheet = 1;
        }
        long startCreateRow = System.currentTimeMillis();
        Row row = sheet.createRow(rowIndexInSheet);
        long endCreateRow = System.currentTimeMillis();
        totalCreateRowTime += endCreateRow - startCreateRow;


        long startSetValue = System.currentTimeMillis();
        for (int colIndex = 0; colIndex < cols; colIndex++) {
            value = populateCell(styles, value, calendar, rowIndex, row, colIndex);
        }
        long endSetValue = System.currentTimeMillis();
        totalSetValue += endSetValue - startSetValue;
        rowIndexInSheet++;
    }
    System.out.println("###@@@ Elapsed average create row " + (totalCreateRowTime / rows) + " milli seconds");
    System.out.println("###@@@ Elapsed average set value " + (totalSetValue / rows) + " milli seconds");
}

Then I added the row shift:

private static void addContent(Workbook workBook, boolean isHType, int rows, int cols) {
    Map<String, CellStyle> styles = createStyles(workBook);

    Sheet sheet = workBook.getSheetAt(0);


    long startShiftRow = System.currentTimeMillis();
    sheet.shiftRows(0, sheet.getLastRowNum(), rows); // the only line added: move existing rows down to make room

    long endShiftRow = System.currentTimeMillis();
    System.out.println("###@@@ Elapsed shift rows " + (endShiftRow - startShiftRow) + " milli seconds");

    Cell headerCell = sheet.createRow(0).createCell(0);
    headerCell.setCellValue("Header text is spanned across multiple cells");
    headerCell.setCellStyle(styles.get("header"));
    sheet.addMergedRegion(CellRangeAddress.valueOf("$A$1:$F$1"));

    int sheetNo = 0;
    int rowIndexInSheet = 1;
    double value = 0;
    Calendar calendar = Calendar.getInstance();
    double totalCreateRowTime = 0;
    double totalSetValue = 0;
    for (int rowIndex = 0; rowIndex < rows; rowIndex++) {
        if (isHType && sheetNo != rowIndex / 0x10000) {
            sheet = workBook.createSheet("Spillover from sheet " + (++sheetNo));
            headerCell.setCellValue("Header text is spanned across multiple cells");
            headerCell.setCellStyle(styles.get("header"));
            sheet.addMergedRegion(CellRangeAddress.valueOf("$A$1:$F$1"));
            rowIndexInSheet = 1;
        }
        long startCreateRow = System.currentTimeMillis();
        Row row = sheet.createRow(rowIndexInSheet);
        long endCreateRow = System.currentTimeMillis();
        totalCreateRowTime += endCreateRow - startCreateRow;


        long startSetValue = System.currentTimeMillis();
        for (int colIndex = 0; colIndex < cols; colIndex++) {
            value = populateCell(styles, value, calendar, rowIndex, row, colIndex);
        }
        long endSetValue = System.currentTimeMillis();
        totalSetValue += endSetValue - startSetValue;
        rowIndexInSheet++;
    }
    System.out.println("###@@@ Elapsed average create row " + (totalCreateRowTime / rows) + " milli seconds");
    System.out.println("###@@@ Elapsed average set value " + (totalSetValue / rows) + " milli seconds");
}

The log output is as follows:

-- before add shift row --
###@@@ Elapsed shift rows 0 milli seconds
###@@@ Elapsed average create row 0.00442 milli seconds
###@@@ Elapsed average set value 0.19238 milli seconds
###@@@ Elapsed done 14224 milli seconds

-- after add shift row --
###@@@ Elapsed shift rows 139 milli seconds
###@@@ Elapsed average create row 2.93634 milli seconds
###@@@ Elapsed average set value 0.21966 milli seconds
###@@@ Elapsed done 165080 milli seconds

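The slowness clearly does not come from shiftRows itself, since that takes only 139 ms. However, after sheet.shiftRows(), the average time per sheet.createRow() call rises from 0.0044 ms to 2.9 ms, more than 500 times slower. I expect to need to write about 150,000 rows to the xlsx file, which would take an unacceptably long time.

For the rest of the code, see https://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/ss/examples/SSPerformanceTest.java or just ask me.

Do you know why sheet.createRow() becomes so slow after calling sheet.shiftRows()? Am I doing something wrong, or is there a workaround? Thanks in advance.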
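As a quick sanity check on the logged figures, the sketch below redoes the arithmetic in plain Java. The class and method names are made up for illustration, and the row count of exactly 50,000 is an assumption (the post only says "about 50,000"). With those inputs, the createRow() slowdown works out to roughly 664 times, and the time spent in createRow() (about 147 s) accounts for nearly all of the roughly 151 s difference between the two runs.

```java
// Back-of-the-envelope check of the timings logged in the question.
// CreateRowCost and its methods are hypothetical helpers, not part of Apache POI.
final class CreateRowCost {

    /** How many times slower the "after" average is than the "before" average. */
    static double slowdownFactor(double beforeMs, double afterMs) {
        return afterMs / beforeMs;
    }

    /** Total time spent in createRow() for a given row count and per-row average. */
    static double totalMs(int rows, double averageMs) {
        return rows * averageMs;
    }

    public static void main(String[] args) {
        int rows = 50_000;          // ASSUMPTION: the post only says "about 50,000 rows"
        double beforeMs = 0.00442;  // logged average createRow() time without shiftRows
        double afterMs = 2.93634;   // logged average createRow() time with shiftRows
        long extraMs = 165_080 - 14_224; // difference between the two logged total run times

        double createRowMs = totalMs(rows, afterMs);
        System.out.printf("slowdown factor: %.0fx%n", slowdownFactor(beforeMs, afterMs));
        System.out.printf("time in createRow(): %.0f ms%n", createRowMs);
        System.out.printf("share of the extra run time: %.1f%%%n", 100.0 * createRowMs / extraMs);
    }
}
```

If the per-row cost stayed at 2.9 ms, 150,000 rows would need about 440 s in createRow() alone. That is only a lower-bound guess, since the logs do not show whether the per-row cost grows with the row count.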

=============================================== ==============

Edit 1

After a bit more investigation, I found that replacing sheet.shiftRows(); with

sheet.createRow(50005);

is enough to make the whole thing slow.

1 answer:

Answer 0 (score: 0)

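After some thought and experimentation, I found the cause of the problem and a workaround.

Cause

Shifting the rows down by 50,000 makes the Excel file too large. Even though rows 1 through 50,000 are empty, processing such a large file becomes slow.

Solution

I copy the rows that I wanted to shift to a temporary sheet, then write my 50,000 rows of records normally. Afterwards, I copy the footer back from the temporary sheet. That way I never have to work with one large sheet, only with two small ones.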
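The order of operations in this workaround can be sketched without POI. In the model below, a TreeMap<Integer, String> stands in for a sheet (row index to row content) and a plain list plays the role of the temporary sheet. FooterStash and both of its methods are hypothetical names; none of this is the answerer's actual code or the POI API. It only shows the stash, write, restore sequence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// NOT Apache POI: a TreeMap<Integer, String> models a sheet as row index -> row content.
final class FooterStash {

    /** Removes every row at or below firstFooterRow from the sheet and returns them in order. */
    static List<String> stashFooter(TreeMap<Integer, String> sheet, int firstFooterRow) {
        Map<Integer, String> tail = sheet.tailMap(firstFooterRow, true);
        List<String> footer = new ArrayList<>(tail.values());
        tail.clear(); // the view is backed by the map, so this removes the rows from the sheet
        return footer;
    }

    /** Writes the data rows starting at startRow, then puts the footer back right after them. */
    static void writeDataThenFooter(TreeMap<Integer, String> sheet, int startRow,
                                    List<String> data, List<String> footer) {
        int next = startRow;
        for (String row : data) {
            sheet.put(next++, row);
        }
        for (String row : footer) {
            sheet.put(next++, row);
        }
    }
}
```

With real POI objects the same three steps would copy cells between Sheet instances instead of moving strings, and the copy logic would depend on the POI version in use.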
