FileChannel and ByteBuffer writing extra data

Time: 2019-07-13 15:44:20

Tags: java io bytebuffer filechannel parity

I am writing a method that takes a file, splits it into shardCount pieces, and generates a parity file.
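For reference, the expected result can be sketched in memory. This is a minimal illustration, not the code from this question: the class `ParitySketch` and the helper `parity` are hypothetical names, and it assumes byte-wise XOR parity with bytes past the end of the file treated as zero.

```java
import java.util.Arrays;

class ParitySketch {
    // Hypothetical helper: XORs shardCount equal-sized slices of data.
    // Bytes past the end of the input are treated as zero (an assumption).
    static byte[] parity(byte[] data, int shardCount) {
        // Ceiling division, so the last shard may be shorter than the others
        int shardSize = (data.length + shardCount - 1) / shardCount;
        byte[] parity = new byte[shardSize];
        for (int j = 0; j < shardCount; j++) {
            int start = j * shardSize;
            for (int i = 0; i < shardSize && start + i < data.length; i++) {
                parity[i] ^= data[start + i];
            }
        }
        return parity;
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4, 5, 6, 7};
        // Shards are [1,2,3], [4,5,6], [7]
        System.out.println(Arrays.toString(parity(data, 3))); // prints [2, 7, 5]
    }
}
```

Under these assumptions the parity output should be exactly as long as one shard, which gives a simple check for whether extra bytes are being written.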

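When I run this method, it seems to write extra data to my parity file. This is my first time using FileChannel and ByteBuffer, so even after staring at the documentation for about 8 hours, I'm not sure I fully understand how to use them.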

This code is a simplified version of the parity section.

public static void splitAndGenerateParityFile(File file, int shardCount, String fileID) throws IOException {
    RandomAccessFile rin = new RandomAccessFile(file, "r");
    FileChannel fcin = rin.getChannel();

    //Create parity files
    File parity = new File(fileID + "_parity");
    if (parity.exists()) throw new FileAlreadyExistsException("Could not create parity file! File already exists!");
    RandomAccessFile parityRAF = new RandomAccessFile(parity, "rw");
    FileChannel parityOut = parityRAF.getChannel();

    long bytesPerFile = (long) Math.ceil(rin.length() / shardCount);

    //Make buffers for each section of the file we will be reading from
    for (int i = 0; i < shardCount; i++) {
        ByteBuffer bb = ByteBuffer.allocate(1024);
        shardBuffers.add(bb);
    }

    ByteBuffer parityBuffer = ByteBuffer.allocate(1024);

    //Generate parity
    boolean isParityBufferEmpty = true;
    for (long i = 0; i < bytesPerFile; i++) {
        isParityBufferEmpty = false;
        int pos = (int) (i % 1024);
        byte p = 0;

        if (pos == 0) {
            //Read chunk of file into each buffer
            for (int j = 0; j < shardCount; j++) {
                ByteBuffer bb = shardBuffers.get(j);
                bb.clear();
                fcin.read(bb, bytesPerFile * j + i);
                bb.rewind();
            }
            //Dump parity buffer
            if (i > 0) {
                parityBuffer.rewind();
                parityOut.write(parityBuffer);
                parityBuffer.clear();
                isParityBufferEmpty = true;
            }
        }

        //Get parity
        for (ByteBuffer bb : shardBuffers) {
            if (pos >= bb.limit()) break;
            p ^= bb.get(pos);
        }

        //Put parity in buffer
        parityBuffer.put(pos, p);
    }

    if (!isParityBufferEmpty) {
        parityBuffer.rewind();
        parityOut.write(parityBuffer);
        parityBuffer.clear();
    }

    fcin.close();
    rin.close();
    parityOut.close();
    parityRAF.close();
}

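Please let me know if there is anything wrong with the parity algorithm or the file IO, or what I could do to optimize it. I'd be glad to hear about other (better) ways of handling file IO.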

1 answer:

Answer 0: (score: 0)

Here is the solution I found (though it may need more tweaking):

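import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.FileAlreadyExistsException;
import java.util.ArrayList;

public static void splitAndGenerateParityFile(File file, int shardCount, String fileID) throws IOException {
    final int BUFFER_SIZE = 4 * 1024 * 1024; //4 MB per buffer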
    RandomAccessFile rin = new RandomAccessFile(file, "r");
    FileChannel fcin = rin.getChannel();

    //Create parity files
    File parity = new File(fileID + "_parity");
    if (parity.exists()) throw new FileAlreadyExistsException("Could not create parity file! File already exists!");
    RandomAccessFile parityRAF = new RandomAccessFile(parity, "rw");
    FileChannel parityOut = parityRAF.getChannel();

    //Create shard files
    ArrayList<File> shards = new ArrayList<>(shardCount);
    for (int i = 0; i < shardCount; i++) {
        File f = new File(fileID + "_part_" + i);
        if (f.exists()) throw new FileAlreadyExistsException("Could not create shard file! File already exists!");
        shards.add(f);
    }

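    //Ceiling division; with long operands, length / shardCount truncates before Math.ceil can round up
    long bytesPerFile = (rin.length() + shardCount - 1) / shardCount;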

    ArrayList<ByteBuffer> shardBuffers = new ArrayList<>(shardCount);

    //Make buffers for each section of the file we will be reading from
    for (int i = 0; i < shardCount; i++) {
        ByteBuffer bb = ByteBuffer.allocate(BUFFER_SIZE);
        shardBuffers.add(bb);
    }

    ByteBuffer parityBuffer = ByteBuffer.allocate(BUFFER_SIZE);

    //Generate parity
    boolean isParityBufferEmpty = true;
    for (long i = 0; i < bytesPerFile; i++) {
        isParityBufferEmpty = false;
        int pos = (int) (i % BUFFER_SIZE);
        byte p = 0;

        if (pos == 0) {
            //Read chunk of file into each buffer
            for (int j = 0; j < shardCount; j++) {
                ByteBuffer bb = shardBuffers.get(j);
                bb.clear();
                fcin.position(bytesPerFile * j + i);
                fcin.read(bb);
                bb.flip();
            }

            //Dump parity buffer
            if (i > 0) {
                parityBuffer.flip();
                while (parityBuffer.hasRemaining()) {
                    parityOut.write(parityBuffer);
                }
                parityBuffer.clear();
                isParityBufferEmpty = true;
            }
        }

        //Get parity
        for (ByteBuffer bb : shardBuffers) {
            if (!bb.hasRemaining()) break;
            p ^= bb.get();
        }

        //Put parity in buffer
        parityBuffer.put(p);
    }

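    if (!isParityBufferEmpty) {
        parityBuffer.flip();
        //write() may not drain the buffer in one call, so loop as in the dump above
        while (parityBuffer.hasRemaining()) {
            parityOut.write(parityBuffer);
        }
        parityBuffer.clear();
    }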

    fcin.close();
    rin.close();
    parityOut.close();
    parityRAF.close();
}

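As VGR suggested, I replaced rewind() with flip(). I also switched from absolute to relative operations. I don't think the absolute methods adjust the cursor position or the limit, so they were most likely the cause of the error. I also changed the buffer size to 4 MB, since I'm interested in generating parity for large files.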
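The difference between the two approaches can be seen in a small standalone sketch. The class and method names here are hypothetical and not part of the code above. It fills only the first few bytes of a buffer and then checks how many bytes a channel write would see as remaining.

```java
import java.nio.ByteBuffer;

class FlipVsRewind {
    // Absolute put(index, b) leaves position at 0, and rewind() leaves
    // limit at capacity, so the whole buffer counts as remaining.
    static int remainingAfterAbsolutePutAndRewind(int capacity, int count) {
        ByteBuffer buf = ByteBuffer.allocate(capacity);
        for (int i = 0; i < count; i++) {
            buf.put(i, (byte) i);
        }
        buf.rewind();
        return buf.remaining();
    }

    // Relative put(b) advances position, and flip() sets limit to that
    // position before resetting position to 0.
    static int remainingAfterRelativePutAndFlip(int capacity, int count) {
        ByteBuffer buf = ByteBuffer.allocate(capacity);
        for (int i = 0; i < count; i++) {
            buf.put((byte) i);
        }
        buf.flip();
        return buf.remaining();
    }

    public static void main(String[] args) {
        System.out.println(remainingAfterAbsolutePutAndRewind(1024, 10)); // prints 1024
        System.out.println(remainingAfterRelativePutAndFlip(1024, 10));   // prints 10
    }
}
```

With only 10 bytes of real data, the rewind() variant still exposes all 1024 bytes to a subsequent write, which is consistent with the extra data seen in the parity file in the question.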