Eliminating the duplicates from a specific column

Time: 2016-08-31 16:48:38

Tags: python bash awk gawk

Let's say I have a file like the one below:

MGW24:EXT-1:YES:OK
MGW24:SET-4:NO:OK
MGW24:SET-132:NO:OK

I want to check column 1 and see whether its value repeats. If a string repeats, I want to blank out the duplicates and print the rest of the columns as they are. Please remember to keep the blank space before column 2 (where column 1 was).

Expected Output:

MGW24:EXT-1:YES:OK
     :SET-4:NO:OK
     :SET-132:NO:OK

2 answers:

Answer 0 (score: 1)

You can use this awk:

awk -F: 'seen[$1]++{c=length($1); $0=sprintf("%*s%s", c, "", substr($0, c+1))} 1' file

MGW24:EXT-1:YES:OK
     :SET-4:NO:OK
     :SET-132:NO:OK
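
Since the question is also tagged python, the same idea can be sketched in Python. This is only an illustration and not part of the original answer: the function name blank_repeated_keys and the use of a set are assumptions. It keeps the first occurrence of each column 1 value and replaces later occurrences with the same number of spaces.

```python
def blank_repeated_keys(lines, sep=":"):
    """Blank out column 1 when its value has been seen before.

    Hypothetical helper mirroring the awk one-liner: `seen` plays the role
    of the awk array, and the padding has the same length as the key.
    """
    seen = set()
    result = []
    for line in lines:
        # key = column 1, found = the separator (or "" if absent), rest = remainder
        key, found, rest = line.partition(sep)
        if key in seen:
            # Repeated key: pad with spaces so the other columns stay aligned.
            result.append(" " * len(key) + found + rest)
        else:
            seen.add(key)
            result.append(line)
    return result


if __name__ == "__main__":
    sample = ["MGW24:EXT-1:YES:OK", "MGW24:SET-4:NO:OK", "MGW24:SET-132:NO:OK"]
    print("\n".join(blank_repeated_keys(sample)))
```

Run on the sample lines from the question, this should print the same three lines as the awk output above.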

Answer 1 (score: 0)

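awk -F":" -v OFS=":" '$1 in a{$1=a[$1]} !($1 in a){t=""; while(length(t) < length($1)) t=t" "; a[$1]=t} 1' file

Breakdown

$1 in a{$1=a[$1]} # if column 1 was seen before, replace it with its stored spaces
!($1 in a){ # if the (possibly blanked) column 1 is not yet a key in a
 t="" # start from an empty string for every new key
 while(length(t) < length($1)) t=t" " # build a run of spaces as long as column 1
 a[$1]=t # store it in array a under that key
}
1 # print the (possibly modified) line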

Output

MGW24:EXT-1:YES:OK
     :SET-4:NO:OK
     :SET-132:NO:OK
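
For readers more comfortable with Python (the question carries that tag too), the breakdown above maps roughly onto the sketch below. It is an assumption-laden illustration rather than a translation endorsed by the answer: the name blank_with_padding_cache is hypothetical, and the dict padding stands in for the awk array a.

```python
def blank_with_padding_cache(lines, sep=":"):
    """Replace repeated column 1 values with cached runs of spaces.

    Hypothetical helper: `padding` maps each key to a string of spaces of
    the same length, much like the awk array `a` in the answer.
    """
    padding = {}
    out = []
    for line in lines:
        fields = line.split(sep)
        key = fields[0]
        if key in padding:
            # Seen before: swap the key for its stored padding.
            fields[0] = padding[key]
        else:
            # First occurrence: remember padding of matching width.
            padding[key] = " " * len(key)
        # Rejoin with the same separator, as OFS=":" does in awk.
        out.append(sep.join(fields))
    return out


if __name__ == "__main__":
    sample = ["MGW24:EXT-1:YES:OK", "MGW24:SET-4:NO:OK", "MGW24:SET-132:NO:OK"]
    print("\n".join(blank_with_padding_cache(sample)))
```

Splitting and rejoining the fields follows the awk version, which rebuilds the line with OFS once column 1 is reassigned.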