Question

How can I get only the differing characters between two files?

For example,

file1:

aaa;bbb;ccc
123;456;789
a1a;b1b;c1c

file2:

aAa;bbb;ccc
123;406;789
a1a;b1b;c5c

After the comparison, I should get only this string of differing characters, taken from the second file: A05

Answer 1

diff -y --suppress-common-lines <(fold -w 1 file1) <(fold -w 1 file2) |
sed 's/.*\(.\)$/\1/' | paste -s -d '' -

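This uses process substitution with fold to turn each file into a column that is one character wide, and then compares the two columns with diff.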

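The -y option prints the lines side by side, and --suppress-common-lines skips lines that are the same in both files. Up to this point, the output looks like this: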

$ diff -y --suppress-common-lines <(fold -w 1 file1) <(fold -w 1 file2)
a                                 | A
5                                 | 0
1                                 | 5

We are only interested in the last character of each line, so we use sed to discard the rest:

$ diff -y --suppress-common-lines <(fold -w 1 file1) <(fold -w 1 file2) |
> sed 's/.*\(.\)$/\1/'
A
0
5

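To get these onto a single line, we pipe to paste with the -s option (serial) and the empty string as the delimiter (-d ''). The dash tells paste to read from standard input.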

$ diff -y --suppress-common-lines <(fold -w 1 file1) <(fold -w 1 file2) |
> sed 's/.*\(.\)$/\1/' | paste -s -d '' -
A05
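To try the pipeline end to end, here is a minimal sketch that recreates the sample files from the question in a scratch directory. The directory and the intermediate files col1 and col2 are illustrative names, not part of the original answer; they stand in for the <(...) process substitutions so the script also runs in shells that lack them. It assumes a diff that supports -y and --suppress-common-lines, and a paste that accepts an empty -d argument, as the GNU tools do.

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'aaa;bbb;ccc' '123;456;789' 'a1a;b1b;c1c' > "$tmp/file1"
printf '%s\n' 'aAa;bbb;ccc' '123;406;789' 'a1a;b1b;c5c' > "$tmp/file2"

# Fold each file to one character per line. These intermediate files
# (hypothetical names) replace the <(fold -w 1 ...) process substitutions.
fold -w 1 "$tmp/file1" > "$tmp/col1"
fold -w 1 "$tmp/file2" > "$tmp/col2"

# Keep only the changed lines, take the last character of each, join them.
result=$(diff -y --suppress-common-lines "$tmp/col1" "$tmp/col2" |
    sed 's/.*\(.\)$/\1/' | paste -s -d '' -)
printf '%s\n' "$result"

# Clean up the scratch directory.
rm -rf "$tmp"
```

Note that this only gives a meaningful result when the two files have the same length: for a character present in just one file, diff -y ends the line with a marker such as < instead of a character from file2.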

Alternatively, if you have GNU diffutils, you can use cmp:

$ cmp -lb file1 file2 | awk '{print $5}' | tr -d '\n'
A05

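cmp compares files byte by byte. The -l option ("verbose") makes it print all differences, not just the first one; the -b option makes it add the ASCII interpretation of the differing bytes: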

$ cmp -lb file1 file2
 2 141 a    101 A
18  65 5     60 0
34  61 1     65 5

The awk command reduces this output to the fifth column, and tr removes the newlines.
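If -b is not available (it is a GNU extension), a possible fallback is to rely on cmp -l alone, which prints the byte offset followed by the two differing bytes in octal. The sketch below is not from the original answer: it converts the third field back into a character with printf's %b conversion and its \0ddd octal escape. The variable names and the scratch directory are illustrative.

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' 'aaa;bbb;ccc' '123;456;789' 'a1a;b1b;c1c' > "$tmp/file1"
printf '%s\n' 'aAa;bbb;ccc' '123;406;789' 'a1a;b1b;c5c' > "$tmp/file2"

# cmp -l prints one line per differing byte:
#   <offset> <octal byte in file1> <octal byte in file2>
# Rebuild the file2 byte from its octal value with printf '%b' and \0ddd.
result=$(cmp -l "$tmp/file1" "$tmp/file2" |
    while read -r offset old new; do
        printf '%b' "\\0$new"
    done)
printf '%s\n' "$result"

# Clean up the scratch directory.
rm -rf "$tmp"
```

Because cmp works on bytes, multi-byte characters are compared and emitted one byte at a time, so this is best suited to single-byte text such as the example.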

Answer 2

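For the example given, you can compare the files character by character and, whenever the characters differ, print the one from the second file. Here is one way to do it: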

paste <(fold -w1 file1) <(fold -w1 file2) | \
while read -r c1 c2; do [[ $c1 = "$c2" ]] || printf '%s' "$c2"; done

For the given example, this prints A05.
