
Time: 2013-06-26 20:49:26

Tags: bash sed awk command-line-interface

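I am trying to remove the first 37 lines from a very, very large file. I started out with sed and awk, but both seem to require copying the data to a new file. I am looking for a "delete lines in place" approach that, unlike sed -i, makes no copy of any kind and simply removes the lines from the existing file.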


awk 'NR > 37' file.xml > 'f2.xml'
sed -i '1,37d' file.xml


4 answers:

Answer 0 (score: 10)

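There is no simple way to do in-place editing with UNIX utilities, but here is one in-place file modification solution that you may be able to adapt to your needs (courtesy of Robert Bonomi at https://groups.google.com/forum/#!topic/comp.unix.shell/5PRRZIP0v64):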

bytes=$(head -37 "$file" |wc -c)
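dd if="$file" bs="$bytes" skip=1 conv=notrunc of="$file"

After the dd step the file is still its original size: the data has been shifted toward the front, and the last "$bytes" bytes are a stale copy of the tail. Cut them off (truncate here is the GNU coreutils command):

truncate -s "-$bytes" "$file"

For example, removing the first 5 lines of a 12-line file:
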
$ wc -l file
12 file

$ cat file
When chapman billies leave the street,
And drouthy neibors, neibors, meet;
As market days are wearing late,
And folk begin to tak the gate,
While we sit bousing at the nappy,
An' getting fou and unco happy,
We think na on the lang Scots miles,
The mosses, waters, slaps and stiles,
That lie between us and our hame,
Where sits our sulky, sullen dame,
Gathering her brows like gathering storm,
Nursing her wrath to keep it warm.

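First use dd to drop the target 5 lines (really "$bytes" bytes) from the start of the file by copying the rest of the data from the end toward the front, while leaving the trailing "$bytes" bytes as-is: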

$ bytes=$(head -5 file |wc -c)

$ dd if=file bs="$bytes" skip=1 conv=notrunc of=file
1+1 records in
1+1 records out
253 bytes copied, 0.0038458 s, 65.8 kB/s

$ wc -l file
12 file

$ cat file
An' getting fou and unco happy,
We think na on the lang Scots miles,
The mosses, waters, slaps and stiles,
That lie between us and our hame,
Where sits our sulky, sullen dame,
Gathering her brows like gathering storm,
Nursing her wrath to keep it warm.
s, waters, slaps and stiles,
That lie between us and our hame,
Where sits our sulky, sullen dame,
Gathering her brows like gathering storm,
Nursing her wrath to keep it warm.


$ truncate -s "-$bytes" "file"

$ wc -l file
7 file

$ cat file
An' getting fou and unco happy,
We think na on the lang Scots miles,
The mosses, waters, slaps and stiles,
That lie between us and our hame,
Where sits our sulky, sullen dame,
Gathering her brows like gathering storm,
Nursing her wrath to keep it warm.

If we had tried the above without conv=notrunc on the dd command, dd would have truncated the output file before reading it:

$ wc -l file
12 file
$ bytes=$(head -5 file |wc -c)
$ dd if=file bs="$bytes" skip=1 of=file
dd: file: cannot skip to specified offset
0+0 records in
0+0 records out
0 bytes copied, 0.0042254 s, 0.0 kB/s
$ wc -l file
0 file
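
The dd and truncate steps above can be wrapped into one function. This is a minimal sketch, not a hardened tool: the name drop_head_lines is made up for illustration, and it assumes GNU coreutils (dd with status=none, and truncate). It is not crash-safe; if it is interrupted between or during the two steps, the file is left in a half-shifted state.

```shell
#!/bin/sh
# drop_head_lines FILE N  (hypothetical helper name)
# Remove the first N lines of FILE in place by shifting the remaining
# data toward the front with dd, then cutting off the stale tail.
drop_head_lines() {
    file=$1
    n=$2

    # Size in bytes of the first N lines.
    bytes=$(head -n "$n" -- "$file" | wc -c) || return 1

    # Nothing to remove (empty file or N = 0).
    [ "$bytes" -gt 0 ] || return 0

    # Read from offset $bytes onward and write from offset 0.
    # conv=notrunc stops dd from emptying the file before reading it.
    dd if="$file" of="$file" bs="$bytes" skip=1 conv=notrunc status=none || return 1

    # The file is still its original size; drop the leftover tail.
    truncate -s "-$bytes" -- "$file"
}
```

Usage would look like drop_head_lines file.xml 37. Because the block size equals the size of the removed head, dd runs slowly on a very large file when the head is small; that is a property of this sketch, not of the technique.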


Answer 1 (score: 6)



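  1. Read the file into memory, then write it back (ed, ex, other editors). This should be fine if your file is <1GB or you have plenty of RAM.
  2. Write a second copy and optionally replace the original (sed -i, awk/tail > foo). This is fine as long as you have enough free disk space for the copy and don't mind the wait.
  3. If the file is too large for either of those to work for you, you may be able to work around the problem, depending on what is reading your file.

    Perhaps your reader skips comments or blank lines? If so, you can craft a message that the reader ignores, make sure it has the same number of bytes as the first 37 lines of your file, and overwrite the start of the file with dd if=yourdata of=file conv=notrunc.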
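
The overwrite idea in option 3 might look like the following. This is only a sketch under a loud assumption: that whatever reads the file ignores a leading line made of spaces. The function name blank_head_lines is hypothetical, and status=none assumes GNU dd.

```shell
#!/bin/sh
# blank_head_lines FILE N  (hypothetical helper name)
# Overwrite the first N lines of FILE with a single line of spaces that
# has exactly the same byte count, so the file size and all later data
# stay where they are. ASSUMPTION: the reader skips whitespace-only lines.
blank_head_lines() {
    file=$1
    n=$2

    # Size in bytes of the first N lines.
    bytes=$(head -n "$n" -- "$file" | wc -c) || return 1
    [ "$bytes" -gt 0 ] || return 0

    # Emit (bytes - 1) spaces followed by one newline, and write that
    # over the start of the file without truncating the rest.
    { printf '%*s' "$((bytes - 1))" ''; printf '\n'; } |
        dd of="$file" conv=notrunc status=none
}
```

Only the head of the file is rewritten, so the cost does not depend on the file size. The removed lines are replaced rather than deleted, which is why this only helps when the reader tolerates the padding.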

Answer 2 (score: 4)


ed -s file <<< $'1,37d\nwq'

Answer 3 (score: 2)

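A copy has to be made at some point, so why not make it at the moment the "modified" file is read, streaming the altered copy instead of storing it?

What I have in mind: create a named pipe "file2" that carries the output of that same awk 'NR > 37' file.xml or whatever; then whoever reads file2 will not see the first 37 lines.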
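
A rough sketch of the named-pipe idea follows. The function name stream_without_head and its argument order are invented for illustration; the original file is never modified.

```shell
#!/bin/sh
# stream_without_head SRC FIFO N  (hypothetical helper name)
# Create a named pipe FIFO and start a background awk that feeds it
# SRC minus its first N lines. SRC itself is left untouched.
stream_without_head() {
    src=$1
    fifo=$2
    n=$3

    mkfifo "$fifo" || return 1

    # The background writer blocks until some process opens FIFO for
    # reading, then streams the remaining lines and exits.
    awk -v n="$n" 'NR > n' "$src" > "$fifo" &
}
```

With this, stream_without_head file.xml file2 37 followed by a reader such as wc -l < file2 sees the file without its first 37 lines. A pipe serves one read-through only, so the awk writer has to be started again for each new reader.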

