Question

Suppose you want to remove comments from a data file using one of the following two approaches:

cat file.dat | sed -e "s/\#.*//"
cat file.dat | grep -v "#"

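How do these individual approaches work, and what is the difference between them? Is it also possible to write the clean data to a new file while keeping any possible warnings or error messages from ending up in that data file? If so, how would you go about it?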

Answer 1

How do these individual approaches work, and what is the difference between them?

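sed and grep are two different commands, but here they serve a similar purpose. Your sed command replaces everything from the first # to the end of the line with nothing (an empty string), so any text before the # is kept and a comment-only line becomes an empty line. grep -v, on the other hand, simply skips every line that contains a # anywhere in it.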
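
To make the difference concrete, here is a minimal sketch that runs both commands on the same input. The sample file and the variable names are made up for this illustration, and the # is left unescaped inside single quotes, which should behave the same as the \# form in the question.

```shell
# Hypothetical sample file, used only for this illustration.
tmpdir=$(mktemp -d)
printf '%s\n' 'first' '# second' 'thi # rd' > "$tmpdir/file.dat"

# sed removes the text from the first # to the end of each line,
# so the comment-only line is kept as an empty line.
sed_out=$(sed -e 's/#.*//' "$tmpdir/file.dat")

# grep -v drops every line that contains a # anywhere.
grep_out=$(grep -v '#' "$tmpdir/file.dat")

printf 'sed:\n%s\n\ngrep -v:\n%s\n' "$sed_out" "$grep_out"
rm -r "$tmpdir"
```

With this input, sed keeps "thi " (including the space before the removed comment), while grep -v loses that line entirely.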

You can find more information about these in the man pages, as shown below:

man grep:

   -v, --invert-match
          Invert the sense of matching, to select non-matching lines.  (-v is specified by POSIX.)

man sed:

   s/regexp/replacement/
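          Attempt to match regexp against the pattern space.  If successful, replace that portion matched with replacement.  The replacement may contain the special character & to refer to that portion of the pattern space which matched, and the special escapes \1 through \9 to refer to the corresponding matching sub-expressions in the regexp.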

Is it possible to write the clean data to a new file while keeping any possible warnings or error messages from ending up in that data file?

Yes, in both commands we can redirect errors with 2>/dev/null.

If so, how would you go about it?

You could try something like 2>/dev/null 1>output_file at the end of either command.
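
As a rough sketch of that redirection, the following forces an error by passing a file that does not exist, then checks that only data reached the output file. All file names here (file.dat, missing.dat, clean.dat) are hypothetical.

```shell
# Hypothetical file names, used only for this illustration.
tmpdir=$(mktemp -d)
printf '%s\n' 'first' '# second' 'thi # rd' > "$tmpdir/file.dat"

# missing.dat does not exist, so sed writes an error message to stderr
# and exits with a non-zero status, which is recorded instead of aborting.
status=0
sed -e 's/#.*//' "$tmpdir/file.dat" "$tmpdir/missing.dat" \
    > "$tmpdir/clean.dat" 2>/dev/null || status=$?

# clean.dat holds only what sed wrote to stdout.
clean=$(cat "$tmpdir/clean.dat")
printf 'exit status: %s\n%s\n' "$status" "$clean"
rm -r "$tmpdir"
```

If you would rather keep the messages than discard them, pointing 2> at a separate file instead of /dev/null should work the same way.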

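Explanation of the sed command: I have now added an explanation of the sed command as well. It is for understanding purposes only. Also, there is no need to run cat first and then sed; you can simply use sed -e "s/\#.*//" Input_file.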

sed -e "  ##Start the sed command; -e adds the script that follows to the commands to be executed.
s/        ##s starts a substitution; the regexp to match follows it.
\#.*      ##Match from a # through everything up to the end of the line.
//"       ##If the regexp matched, replace the matched text with nothing (an empty string).

Answer 2

grep -v will drop every line that has a # anywhere in it, for example:

$ cat file
first
# second
thi # rd

so

$ grep -v "#" file
first

will remove every line that contains a #, which is undesirable. Instead, you should use:

$ grep -o "^[^#]*" file
first
thi

just like the sed command, but this way you won't get empty lines. man grep:

   -o, --only-matching
          Print  only  the  matched  (non-empty) parts of a matching line,
          with each such part on a separate output line.
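
For comparison, here is a small sketch that puts the grep -o approach next to a sed variant that also drops the leftover blank lines. The extra sed expressions (trimming trailing whitespace and deleting empty lines) are my own addition for illustration, not part of the original commands.

```shell
# Same hypothetical sample file as above.
tmpdir=$(mktemp -d)
printf '%s\n' 'first' '# second' 'thi # rd' > "$tmpdir/file"

# grep -o prints only the part of each line before the first #.
# A line that starts with # yields an empty match, which is not printed.
grep_o_out=$(grep -o '^[^#]*' "$tmpdir/file")

# Assumed sed equivalent: strip the comment, strip trailing whitespace,
# then delete lines that ended up empty.
sed_out=$(sed -e 's/#.*//' -e 's/[[:space:]]*$//' -e '/^$/d' "$tmpdir/file")

printf 'grep -o:\n%s\n\nsed:\n%s\n' "$grep_o_out" "$sed_out"
rm -r "$tmpdir"
```

Note that grep -o keeps the whitespace in front of the removed comment ("thi " with a trailing space), whereas the sed variant trims it.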
