Question

I'm new to bash and have the following requirement:

I have a file like this:

col1,col2,col3....col25
s1,s2,s2..........s1
col1,col2,col3....col25
s3,s2,s2..........s2

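As you may notice, the values in these columns can only be one of 3 types: s1, s2, s3.

I can extract the last two lines from the given file, which gives: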

col1,col2,col3....col25
s3,s1,s2..........s2
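For reference, one common way to get just the last header/value pair is tail. This is only a sketch on a hypothetical temporary sample file, not necessarily how the two lines above were produced:

```shell
# Hypothetical sample: two header/value pairs written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'col1,col2,col3' 's1,s2,s2' 'col1,col2,col3' 's3,s1,s2' > "$sample"

# Keep only the last two lines (the most recent header and its values).
last_pair=$(tail -n 2 "$sample")
printf '%s\n' "$last_pair"
```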

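I want to parse the lines above further so that I get only the columns that have, say, the value s1.

Desired output: say col3 and col25 are the only columns with the value s2; then a comma-separated list is fine too, e.g.: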

col3,col25

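Can someone help?

P.S. I have found many examples that parse a file based on the value of, say, the second (fixed) column, but what do we do when the column number is not fixed? URL already checked: awk one liner select only rows based on value of a column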

Answer 1

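Assumptions:

there are 2 input lines
each input line has the same number of comma-separated items

We can use a couple of arrays to collect the input data, making sure both use the same array indexes. Once the data is loaded into the arrays, we loop through them looking for our value match.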

$ cat col.awk
  /col1/ { for (i=1; i<=NF; i++) { arr_c[i]=$i } ; n=NF }
! /col1/ { for (i=1; i<=NF; i++) { arr_s[i]=$i }        }
END {
sep=""
for (i=1; i<=n; i++)
    { if (arr_s[i]==smatch)
         { printf "%s%s" ,sep,arr_c[i]
           sep=", "
         }
    }
}

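/col1/: for lines that contain col1, store the fields in the array arr_c
n=NF: record our maximum array index (NF = number of fields)
! /col1/: for lines that do not contain col1, store the fields in the array arr_s
END ...: executed after the arrays have been loaded
sep="": set our initial output separator to the empty string
for (...): loop through our array indexes (1 to n)
if (arr_s[i]==smatch): if the s array value matches our input parameter (smatch, see the example below), then ...
printf "%s%s",sep,arr_c[i]: printf our sep and the matching c array item, then ...
sep=", ": set the separator for the next match in the loop

We use printf because, with no '\n' (newline) specified, all output goes to a single line.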
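One side effect of never printing '\n' is that the output line is left unterminated. The sketch below wraps the same logic in a hypothetical shell function (match_cols and the temporary demo file are names assumed here for illustration) and adds a final printf "\n" in the END block:

```shell
# Hypothetical helper: $1 = value to match, $2 = file with one header line and one value line.
match_cols() {
  awk -F, '
    /col1/  { for (i = 1; i <= NF; i++) arr_c[i] = $i; n = NF }
    !/col1/ { for (i = 1; i <= NF; i++) arr_s[i] = $i }
    END {
      sep = ""
      for (i = 1; i <= n; i++)
        if (arr_s[i] == smatch) { printf "%s%s", sep, arr_c[i]; sep = ", " }
      printf "\n"   # end the line so the next shell prompt starts on a fresh line
    }
  ' smatch="$1" "$2"
}

demo=$(mktemp)   # temporary stand-in for a two-line input file
printf '%s\n' 'col1,col2,col3,col4,col5' 's3,s1,s2,s1,s3' > "$demo"
match_cols s1 "$demo"
```

With that change the result is a complete line, which matters if the output is piped into line-oriented tools such as wc -l.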

Example:

$ cat col.out
col1,col2,col3,col4,col5
s3,s1,s2,s1,s3
$ awk -F, -f col.awk smatch=s1 col.out                                                                                           
col2, col4

-F,: define the input field separator as a comma
Here we pass the search pattern s1 in the awk variable named smatch, which is referenced in the awk code (see col.awk above)
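awk also accepts variable assignments through its -v option. The two forms differ in when the assignment takes effect, which the small sketch below illustrates. For col.awk either form should behave the same, since smatch is only read in the END block:

```shell
# -v assigns before the BEGIN block runs.
awk -v smatch=s1 'BEGIN { print "with -v: [" smatch "]" }'

# A var=value operand is assigned only when awk reaches it while walking its
# operands, which happens after BEGIN, so BEGIN still sees an empty value here.
awk 'BEGIN { print "as operand: [" smatch "]" }' smatch=s1
```

The first command prints the value inside the brackets; the second prints empty brackets.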

If you want to do everything on the command line:

$ awk -F, '
  /col1/ { for (i=1; i<=NF; i++) { arr_c[i]=$i } ; n=NF }
! /col1/ { for (i=1; i<=NF; i++) { arr_s[i]=$i }        }
END {
sep=""
for (i=1; i<=n; i++)
    { if (arr_s[i]==smatch)
         { printf "%s%s" ,sep,arr_c[i]
           sep=", "
         }
    }
}
' smatch=s1 col.out
col2, col4

Or collapse the END block to a single line:

awk -F, '
  /col1/ { for (i=1; i<=NF; i++) { arr_c[i]=$i } ; n=NF }
! /col1/ { for (i=1; i<=NF; i++) { arr_s[i]=$i }        }
END { sep="" ; for (i=1; i<=n; i++) { if (arr_s[i]==smatch) { printf "%s%s" ,sep,arr_c[i] ; sep=", " } } }
' smatch=s1 col.out
col2, col4

Answer 2

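I'm not very comfortable with awk, but here is something that seems to work; it prints only the names of the columns whose corresponding value is s1: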

#<yourTwoLines> | 
  tac | 
  awk -F ',' 'NR == 1 { for (f=1; f<=NF; f++) { relevant[f]= ($f == "s1") } };
              NR == 2 { for (f=1; f<=NF; f++) { if(relevant[f]) print($f) } }'

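It works as follows:

Use tac to reverse the line order, so that the values (the criteria) are processed before the headers (which we will print according to the criteria).
When awk processes the first line (now the values), record in the array relevant, for each field, whether its value is s1.
When awk processes the second line (now the headers), print the ones that correspond to an s1 value, thanks to the array populated earlier.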
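The steps above can be tried end to end with a small self-contained sketch. The function name s1_columns and the sample data are assumptions made for illustration; tac is a GNU coreutils tool and may be missing on some systems. The trailing paste joins the one-name-per-line output into a comma-separated list:

```shell
# Hypothetical helper: reads one header line and one value line on stdin.
s1_columns() {
  tac |
    awk -F',' 'NR == 1 { for (f = 1; f <= NF; f++) relevant[f] = ($f == "s1") }
               NR == 2 { for (f = 1; f <= NF; f++) if (relevant[f]) print $f }' |
    paste -sd, -   # join the printed names with commas
}

printf '%s\n' 'col1,col2,col3,col4,col5' 's3,s1,s2,s1,s3' | s1_columns
```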

Answer 3

A solution in awk that prints a result line after parsing each group of 2 lines.

$ cat tst.awk
BEGIN {FS=","; p=0}
/s1|s2|s3/ { for (i=1; i<=NF; i++) { if ($i=="s2") str = sprintf("%s%s", str?str ", ":str, c[i]) }; p=1 }
!p { for (i=1; i<=NF; i++) { c[i] = $i } }
p { print str; p=0; str="" }

Rationale: build the result string while you loop through the value line.

Whenever your input contains s1, s2 or s3, loop through the elements and, if value == s2, add the column at index i to the result string str; set the print variable p to 1.
If p = 0, build the column array.
If p = 1, print the result string str.

Each header/value pair in the input produces one line of output. A pair whose value line contains no s2's produces an empty line.
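To see the one-line-per-pair behaviour, here is a sketch that runs the same rules on a hypothetical three-pair input (the data and the helper name pairs_s2 are made up for illustration). The loops use i <= NF so that the last column is checked as well:

```shell
# Hypothetical helper: $1 = file made of alternating header and value lines.
pairs_s2() {
  awk 'BEGIN { FS = ","; p = 0 }
       /s1|s2|s3/ {
         for (i = 1; i <= NF; i++)
           if ($i == "s2") str = sprintf("%s%s", (str ? str ", " : str), c[i])
         p = 1                                     # a value line was seen: print next
       }
       !p { for (i = 1; i <= NF; i++) c[i] = $i }  # header line: remember the names
       p  { print str; p = 0; str = "" }' "$1"
}

pairs=$(mktemp)
printf '%s\n' \
  'col1,col2,col3' 's1,s2,s2' \
  'col1,col2,col3' 's3,s1,s2' \
  'col1,col2,col3' 's1,s3,s1' > "$pairs"
pairs_s2 "$pairs"
```

The third pair has no s2 values, so the third output line is empty.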

Answer 4

Let's say you have this:

cat file
col1,col2,col3,..,col25
s3,s1,s2,........,s2

Then you can use awk to print the names of the columns that hold a given value.
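As a minimal sketch of selecting column names by value from a two-line file like this one (the helper cols_with, the variable want, and the sample data are assumptions for illustration, and this is not necessarily the command this answer had in mind):

```shell
# Hypothetical helper: $1 = value to look for, $2 = file with a header line and a value line.
cols_with() {
  awk -F, -v want="$1" '
    NR == 1 { split($0, header, FS); next }   # remember the column names
    {
      out = ""
      for (i = 1; i <= NF; i++)
        if ($i == want) out = out (out == "" ? "" : ",") header[i]
      print out
    }' "$2"
}

two_lines=$(mktemp)
printf '%s\n' 'col1,col2,col3,col4' 's3,s1,s2,s2' > "$two_lines"
cols_with s2 "$two_lines"
```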

Answer 5

If the order of the returned columns is not an issue:

awk -F"," 'NR==1{for(i=1;i<=NF;i++){a[i]=$i};next}{for(i=1;i<=NF;i++){if($i=="s2")b[i]=$i}}END{for( i in b) m=m a[i]",";  gsub(/,$/,"", m); print m }'
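The order caveat comes from for (i in b), whose traversal order awk leaves unspecified. If the original column order matters, one possible variation is to loop over the indexes numerically instead. The sketch below assumes a hypothetical helper name (s2_in_order) and sample data:

```shell
# Hypothetical helper: $1 = file with a header line followed by a value line.
s2_in_order() {
  awk -F, '
    NR == 1 { for (i = 1; i <= NF; i++) a[i] = $i; n = NF; next }
    { for (i = 1; i <= NF; i++) if ($i == "s2") b[i] = 1 }
    END {
      for (i = 1; i <= n; i++)        # numeric loop instead of "for (i in b)"
        if (i in b) m = m a[i] ","
      sub(/,$/, "", m)                # drop the trailing comma
      print m
    }' "$1"
}

ordered=$(mktemp)
printf '%s\n' 'col1,col2,col3,col4,col5' 's2,s1,s2,s1,s2' > "$ordered"
s2_in_order "$ordered"
```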

bash - select columns based on value
