Merging a Hive ORC table's external files with the ORC Core API mergeFiles?

Time: 2018-01-20 18:43:45

Tags: java apache hadoop hive orc

I have ORC files in the external location of a Hive ORC table. I want to merge the ORC files under this path into a single file using the ORC Core API:

public static List<Path> mergeFiles(Path outputPath,
                                    OrcFile.WriterOptions options,
                                    List<Path> inputFiles)
                             throws IOException
Merges multiple ORC files that all have the same schema to produce a single ORC file. The merge will reject files that aren't compatible with the merged file so the output list may be shorter than the input list. The stripes are copied as serialized byte buffers. The user metadata are merged and files that disagree on the value associated with a key will be rejected.

To do this, I need to initialize WriterOptions(Properties tableProperties, Configuration conf) with the tableProperties populated from the Hive external table. Any help with this would be much appreciated.
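To make the question concrete, here is a minimal sketch of the piece that seems to be missing: turning the table's parameters into a java.util.Properties. It assumes the parameters are already available as a Map<String, String> (for example from the metastore's Table.getParameters(), which is an assumption about how they would be fetched). The class name TablePropertiesSketch, the method toProperties, and the sample values are hypothetical and exist only for illustration. The ORC calls appear as comments because they need the ORC and Hadoop libraries on the classpath.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper; the class and method names are illustrative
// and are not part of the ORC or Hive APIs.
final class TablePropertiesSketch {

    // Copies table parameters into a Properties object,
    // skipping entries with a null key or value (Properties rejects nulls).
    static Properties toProperties(Map<String, String> tableParams) {
        Properties props = new Properties();
        for (Map.Entry<String, String> e : tableParams.entrySet()) {
            if (e.getKey() != null && e.getValue() != null) {
                props.setProperty(e.getKey(), e.getValue());
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // Sample parameters, assumed to mirror what the table defines.
        Map<String, String> tableParams = new HashMap<>();
        tableParams.put("orc.compress", "ZLIB");
        tableParams.put("orc.stripe.size", "67108864");

        Properties props = toProperties(tableParams);
        System.out.println(props.getProperty("orc.compress"));

        // With ORC and Hadoop on the classpath, the result could then be
        // passed on (outputPath and inputFiles are placeholders):
        //   OrcFile.WriterOptions options =
        //       OrcFile.writerOptions(props, new Configuration());
        //   List<Path> merged =
        //       OrcFile.mergeFiles(outputPath, options, inputFiles);
    }
}
```

Since mergeFiles may reject incompatible inputs, comparing the returned list against inputFiles would show which files were actually merged.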

0 answers:

No answers