Counting occurrences of a character

Date: 2018-02-14 14:19:18

Tags: linux shell command-line

I have a .txt file:

ID Number        Name                         Fed Sex Tit  Wtit
4564             A B M Yusop, Tapan           BAN M
59841212         A Rafiq                      IND F   WFM  WFM
19892            Aadel F , Arvin              IND M 
.
.
.

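I need to count, on the Linux command line, how many females (F) and males (M) are listed in this file. I am new to the Linux shell, so the only tool I have considered is grep, but the Name column can also contain "M" and "F".

Any suggestions?

3 answers:

Answer 0 (score: 0)

I would use awk for this (find the column, then count):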

$ awk '
# first line
NR == 1 { 
    if (col = index($0, "Sex")) {
        next # skip rest of script for this line
    }

    print "Could not find the required header"
    exit
} 

# all lines
{ 
    # increment counts of each `M` or `F`
    ++count[substr($0, col, 1)]
} 

END { 
    # loop through count array and print
    for (i in count) print i, count[i] 
}' file
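
One thing to watch with the script above: lines that are blank or too short to reach the Sex column (such as the `.` placeholder lines in the question) are tallied under an empty key. If that matters, a variant along these lines counts only `M` and `F` and sends the error to stderr. This is a sketch, not a replacement for the answer; the function name `count_sex` is made up for illustration, and it assumes the file is strictly fixed-width with the header on line 1.

```shell
# Hypothetical helper: count M/F in the fixed-width "Sex" column of file $1.
count_sex() {
    awk '
    NR == 1 {
        # 1-based character offset of the "Sex" header (0 if absent)
        col = index($0, "Sex")
        if (!col) {
            print "Could not find the required header" > "/dev/stderr"
            exit 1
        }
        next
    }
    {
        # take the single character under the header; ignore anything else
        c = substr($0, col, 1)
        if (c == "M" || c == "F") ++count[c]
    }
    END {
        # awk does not guarantee the iteration order of "for ... in"
        if (col) for (c in count) print c, count[c]
    }' "$1"
}
```

Usage would be something like `count_sex file | sort`, with the `sort` there only to make the output order predictable.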

Answer 1 (score: -1)

First use cut to extract a single column. Something like:

cut -c40 < file.txt # gets the 40th character on each line

Then count the distinct values:

cut -c40 < file.txt | sort | uniq -c
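
The `40` in these commands is only an example offset. As a rough sketch, the offset could be read from the header instead of being hard-coded, with the header line skipped and anything other than `M` or `F` filtered out. The function name `count_by_cut` is hypothetical, and the sketch assumes a strictly fixed-width file containing only single-byte characters (`cut -c` in GNU coreutils counts bytes).

```shell
# Hypothetical helper: locate the "Sex" column from the header, then cut it.
count_by_cut() {
    file=$1

    # 1-based offset of "Sex" on the first line (0 if not found)
    col=$(awk 'NR == 1 { print index($0, "Sex"); exit }' "$file")
    if [ "${col:-0}" -le 0 ]; then
        echo "no Sex header found in $file" >&2
        return 1
    fi

    # skip the header, keep one character per line, keep only M/F, count
    tail -n +2 "$file" | cut -c"$col" | grep -x '[MF]' | sort | uniq -c
}
```

Called as `count_by_cut file.txt`, this prints one `uniq -c` line per value found.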

Answer 2 (score: -1)

In bash with GNU grep, you can write:

IFS= read -r header < file          # read the first line of the file
prefix=${header%%Sex *}             # remove "Sex " and everything after it
skip_regex=${prefix//?/.}           # replace all chars with "."

# then find the letters and count them
grep -oP "^$skip_regex\\K[MF]" file | sort | uniq -c

Output

  1 F
  2 M
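
Where `grep -P` or bash's `${var//pattern/replacement}` expansion is not available, the same idea (skip a fixed number of characters, then match `M` or `F`) can probably be expressed with awk and sed. This is a different technique from the one above, offered only as a sketch; the function name `count_by_sed` is made up, and it assumes the same fixed-width layout.

```shell
# Hypothetical helper: same "skip N characters, then match [MF]" idea with sed.
count_by_sed() {
    # 1-based offset of "Sex" on the header line (0 if not found)
    col=$(awk 'NR == 1 { print index($0, "Sex"); exit }' "$1")
    if [ "${col:-0}" -le 0 ]; then
        echo "no Sex header found in $1" >&2
        return 1
    fi
    n=$((col - 1))   # number of characters to skip before the Sex column

    # print only the M/F found right after the first n characters;
    # the header does not match because it has "S" at that position
    sed -n "s/^.\{$n\}\([MF]\).*/\1/p" "$1" | sort | uniq -c
}
```

Running `count_by_sed file` should give the same kind of `uniq -c` listing as the output shown above.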