Hadoop: Is it possible to avoid replication for certain files?

Time: 2015-07-30 21:03:06

Tags: hadoop hdfs replication

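As I understand it, all files in HDFS are replicated. However, our jobs write some log files that we do not want replicated, since keeping extra copies of them is unnecessary. Is this possible, i.e. can we avoid replication for the log files only?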

1 answer:

Answer 0 (score: 2)

You can set the replication factor with the -setrep option of the hadoop fs shell command.

Usage: hadoop fs -setrep [-R] [-w] <numReplicas> <path>

Changes the replication factor of a file. If path is a directory then the command recursively changes the replication factor of all files under the directory tree rooted at path.

Options:

The -w flag requests that the command wait for the replication to complete. This can potentially take a very long time.
The -R flag is accepted for backwards compatibility. It has no effect.

Example:

hadoop fs -setrep -w 3 /user/hadoop/dir1

To avoid replication, set numReplicas to 1, so that HDFS keeps only a single copy of each block.
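As a minimal sketch of how this could be applied to log files only: -setrep changes files that already exist, while files written later get the client's default dfs.replication, so the second helper below passes -D dfs.replication=1 as a generic option when uploading. The function names, the HADOOP_BIN variable, and the /user/hadoop/logs path are hypothetical and only serve the illustration.

```shell
# Hypothetical helpers; adjust names and paths to your own setup.
# HADOOP_BIN is an assumed override so the command can be swapped out (e.g. for a dry run).

# Reduce already-written files under a path to a single copy.
set_log_replication() {
  "${HADOOP_BIN:-hadoop}" fs -setrep 1 "$1"
}

# Upload a new file with a replication factor of 1 for this command only.
put_unreplicated() {
  "${HADOOP_BIN:-hadoop}" fs -D dfs.replication=1 -put "$1" "$2"
}
```

For example, set_log_replication /user/hadoop/logs would lower the replication of the existing logs, and put_unreplicated app.log /user/hadoop/logs/ would write a new one with a single copy. Setting HADOOP_BIN=echo prints the commands instead of running them.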