How do I remove partial duplicates from a text file?

Date: 2017-12-30 04:53:00

Tags: bash awk sed grep

How can I remove partial duplicates in bash using awk, grep, or sort?

Input:

"3","6"
"3","7"
"4","9"
"5","6"
"26","48"
"543","7"

Expected output:

"3","6"
"3","7"
"4","9"
"26","48"

1 answer:

Answer 0: (score: 2)

Could you please try the following and let me know if it helps?

awk -F'[",]' '!a[$5]++'   Input_file

The output will be as follows.

"3","6"
"3","7"
"4","9"
"26","48"
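
To see why the key is $5 rather than $2, it can help to print the fields awk produces when both the double quote and the comma act as separators. The following is only a small sketch on one sample line; the variable names and the bracketed output format are illustrative and not part of the original answer.

```shell
# One sample line from the question.
line='"3","6"'

# Print every field awk sees when each double quote and each comma is a separator.
fields=$(printf '%s\n' "$line" |
  awk -F'[",]' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
```

Each separator character ends a field, so the empty strings around the quotes count as fields too. That is why the first value lands in $2 and the second value in $5.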

EDIT: Adding an explanation here too.

awk -F'[",]' '   ## Set the field separator to either " or , for every line of Input_file.
!a[$5]++         ## Use $5 (the fifth field, which holds the second quoted value) as an index into array a.
                 ## The condition is true only while that value has not been seen yet; the ++ then
                 ## records it, so any later line with the same value is skipped.
                 ## No action is given, so awk falls back to its default action and prints the line.
' Input_file     ## Name of the input file.
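
The explanation above can be checked end to end with the sample data from the question. This is a minimal sketch, assuming the data is held in a shell variable instead of Input_file. The comma-only variant with the array name seen is an assumption added for comparison and does not come from the original answer.

```shell
# Sample data from the question, kept in a variable instead of Input_file.
data='"3","6"
"3","7"
"4","9"
"5","6"
"26","48"
"543","7"'

# The command from the answer: keep the first line for each distinct value in $5.
result=$(printf '%s\n' "$data" | awk -F'[",]' '!a[$5]++')

# Illustrative variant: split on commas only and key on the whole second column.
variant=$(printf '%s\n' "$data" | awk -F',' '!seen[$2]++')

printf '%s\n' "$result"
```

Both commands keep the same four lines for this input. The comma-only variant keys on the quoted text of the second column, so it would behave differently if the quoting were inconsistent between lines.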