Question

I have a CSV file like this:

2018-May-17 21:33:16,VF-AUDI-prod,Start:2018-May-17:End:2018-May-19
2018-May-17 21:34:15,VF-AUDI-prod,Start:2018-May-17:End:2018-May-19
2018-May-17 21:35:17,VF-AUDI-prod,Start:2018-May-17:End:2018-May-19

I only need to convert the first column to YYYYMMDDHHmmss format, like this:

20180517213316,VF-AUDI-prod,Start:2018-May-17:End:2018-May-19
20180517213415,VF-AUDI-prod,Start:2018-May-17:End:2018-May-19
20180517213517,VF-AUDI-prod,Start:2018-May-17:End:2018-May-19

How can I do this with sed without modifying the other columns?

Answer 1

$ awk -F'[- :,]' '{
    t = $1 sprintf("%02d",(index("JanFebMarAprMayJunJulAugSepOctNovDec",$2)+2)/3) $3 $4 $5 $6
    sub(/[^,]+/,t)
}1' file
20180517213316,VF-AUDI-prod,Start:2018-May-17:End:2018-May-19
20180517213415,VF-AUDI-prod,Start:2018-May-17:End:2018-May-19
20180517213517,VF-AUDI-prod,Start:2018-May-17:End:2018-May-19

Answer 2

There are two ways to do the replacement, but both of them need a helper shell script.

PHP version

sed -r 's/([^,]*),(.*)/echo $(echo "\1"|.\/php.sh),\2/e' file

php.sh

#!/bin/sh

read str
php -r "echo date('YmdHis', strtotime('$str'));"

Bash version

sed -r 's/([^-]*)-([^-]*)-([0-9]{1,2})[[:space:]]*([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),(.*)/echo \1$(echo "\2"\|.\/help.sh)\3\4\5\6,\7/e' file

help.sh

#!/bin/sh

read str

case $str in
    Jan) MON=01 ;;
    Feb) MON=02 ;;
    Mar) MON=03 ;;
    Apr) MON=04 ;;
    May) MON=05 ;;
    Jun) MON=06 ;;
    Jul) MON=07 ;;
    Aug) MON=08 ;;
    Sep) MON=09 ;;
    Oct) MON=10 ;;
    Nov) MON=11 ;;
    Dec) MON=12 ;;
esac

echo $MON

Output:

20180517213316,VF-AUDI-prod,Start:2018-May-17:End:2018-May-19
20180517213415,VF-AUDI-prod,Start:2018-May-17:End:2018-May-19
20180517213517,VF-AUDI-prod,Start:2018-May-17:End:2018-May-19

For details on embedding echo in a sed command, you can refer to this link.
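As a minimal sketch of the mechanism both commands above rely on: with the e flag of GNU sed's s command, the pattern space left after the substitution is executed as a shell command and replaced by that command's output. The function name upper_first_field and the sample input are made up for illustration, and GNU sed is assumed. Because the field contents are handed to the shell, this is only suitable for trusted input.

```shell
#!/bin/sh
# Hypothetical demo of the GNU sed "e" flag (not part of the original answer).
# The substitution turns the line  hello,world  into the command
#   echo $(echo "hello" | tr a-z A-Z),world
# and the e flag runs it, so the command's output becomes the new line.
upper_first_field() {
    sed -E 's/([^,]*),(.*)/echo $(echo "\1" | tr a-z A-Z),\2/e'
}

printf '%s\n' 'hello,world' | upper_first_field   # prints HELLO,world
```

Only the first field goes through the inner command; the rest of the line is echoed back unchanged, which is the same shape as the php.sh and help.sh variants.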

Answer 3

The following awk may help you.

awk -F"," '
BEGIN{
   num=split("jan,feb,mar,apr,may,jun,jul,aug,sep,oct,nov,dec",array,",");
   for(i=1;i<=num;i++){
      month[array[i]]=sprintf("%02d",i)}
}
{
   split($1,a,"[- ]");
   a[2]=month[tolower(a[2])];
   $1=a[1] a[2] a[3] a[4];
   gsub(/:/,"",$1)
}
1' OFS=","   Input_file

Explanation of the code:

awk -F"," '                                                                ##Set the field separator to a comma for all lines.
BEGIN{                                                                     ##Start the BEGIN section of awk.
   num=split("jan,feb,mar,apr,may,jun,jul,aug,sep,oct,nov,dec",array,","); ##Split the month names into array; its length is stored in num.
   for(i=1;i<=num;i++){                                                    ##Loop with i from 1 up to num.
      month[array[i]]=sprintf("%02d",i)}                                   ##Build array month, indexed by month name, whose value is i as two digits.
}
{                                                                          ##Main section, executed for every line of Input_file.
   split($1,a,"[- ]");                                                     ##Split $1 into array a, using space and - as delimiters.
   a[2]=month[tolower(a[2])];                                              ##Replace the month name in a[2] with its two-digit number.
   $1=a[1] a[2] a[3] a[4];                                                 ##Rebuild the first field from year, month, day and time.
   gsub(/:/,"",$1)                                                         ##Remove every colon from the first field.
}
1                                                                          ##1 is a true condition, so awk prints the current line.
' OFS="," Input_file                                                       ##Set the output field separator to a comma and name the input file.

Answer 4

awk -F, '{ gsub(/:| /, "", $1); 
    x=(match("JanFebMarAprMayJunJulAugSepOctNovDec", substr($1,6,3))+2)/3;
    x=x>9?x:0x; gsub(/-.*-/, x, $1) }1' OFS=, infile

Output:

20180517213316,VF-AUDI-prod,Start:2018-May-17:End:2018-May-19
20180517213415,VF-AUDI-prod,Start:2018-May-17:End:2018-May-19
20180517213517,VF-AUDI-prod,Start:2018-May-17:End:2018-May-19

How it works

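-F, sets the comma as the field separator.
gsub(/:| /, "", $1) removes the spaces and colons from the first field.
substr($1,6,3) returns the month name from the first field.
match("JanFebMarAprMayJunJulAugSepOctNovDec", substr($1,6,3)) returns the position (index) of the first character of the month name within the string JanFebMarAprMayJunJulAugSepOctNovDec; for May that is 13. The result is always one of 1,4,7,10,13,16,19,22,25,28,31,34. Because every month name is 3 characters long, adding 2 moves the position to the end of the matched month name, and dividing by 3 then gives the month number:

(13+2)/3=5

x=x>9?x:0x prefixes x with a 0 when the month number is less than 10.
gsub(/-.*-/, x, $1) replaces everything from the first hyphen to the last one in the first field (the month name and the hyphens around it) with the value of x.
1 is always true, which causes awk to print the line.
OFS=, sets the Output Field Separator back to a comma.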
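The position arithmetic described above can be tried in isolation. This is only a sketch: the helper name month_number is hypothetical, and it uses index (a plain substring search, as in Answer 1) instead of match, since no regular expression is needed here.

```shell
#!/bin/sh
# Hypothetical helper: map an English month abbreviation to a two-digit number
# using the (position + 2) / 3 trick explained above.
month_number() {
    awk -v m="$1" 'BEGIN {
        # Start position of m in the packed string: 1, 4, 7, ..., 34.
        pos = index("JanFebMarAprMayJunJulAugSepOctNovDec", m)
        # Reject anything that is not a whole month name at a 3-character boundary.
        if (length(m) != 3 || pos == 0 || pos % 3 != 1) exit 1
        printf "%02d\n", (pos + 2) / 3
    }'
}

month_number May   # prints 05
month_number Dec   # prints 12
```

Using printf "%02d" does the zero padding in one step, so the separate x=x>9?x:0x expression is not needed in this variant.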

Answer 5

Sed one-liner:

$ cat file.csv | sed 's/^\([[:digit:]]*\)-\([^ ]*\)\(.*\)/\2-\1\3/g' | sed 's/\([^,]*\),\(.*\)/echo $(date -d "\1" +%Y%m%d%H%M%S ),\2/e'

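Explanation

Reorder the date from year-month-day to month-day-year so that date -d can parse it.
Use sed to replace only the first column.
Use date's -d option to read the input date.
Use date +%Y%m%d%H%M%S to print the output.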
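The same idea (let date do the parsing and formatting) can also be sketched without sed's e flag, so that the field contents are never run as a shell command. This assumes GNU date, whose -d option accepts a string such as "17 May 2018 21:33:16"; the function name convert_first_column is made up for illustration, and TZ=UTC is set only so the result does not depend on local daylight-saving rules.

```shell
#!/bin/sh
# Hypothetical sketch: reformat only the first comma-separated column via GNU date.
convert_first_column() {
    # "stamp" gets the first column; "rest" keeps all remaining columns untouched.
    while IFS=, read -r stamp rest; do
        year=${stamp%%-*}      # 2018
        tmp=${stamp#*-}        # May-17 21:33:16
        mon=${tmp%%-*}         # May
        tmp=${tmp#*-}          # 17 21:33:16
        day=${tmp%% *}         # 17
        time=${tmp#* }         # 21:33:16
        # GNU date parses the reordered string and prints it as YYYYMMDDHHmmss.
        new=$(TZ=UTC date -d "$day $mon $year $time" +%Y%m%d%H%M%S)
        printf '%s,%s\n' "$new" "$rest"
    done
}

printf '%s\n' '2018-May-17 21:33:16,VF-AUDI-prod,Start:2018-May-17:End:2018-May-19' |
    convert_first_column
```

This starts one date process per line, so it is slower than the awk answers on large files.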

Answer 6

This might work for you (GNU sed):

l="Jan01Feb02Mar03Apr04May05Jun06Jul07Aug08Sep09Oct10Nov11Dec12"
sed -r 's/$/\n'"$l"'/;s/^(....)-(...)-(..) (..):(..):(.*)\n.*\2(..).*/\1\7\3\4\5\6/' file

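Append a lookup table to the end of each line and, using pattern matching, grouping and back-references, convert the first column to the required format.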
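The lookup-table technique is easier to follow on a smaller input. The sketch below applies it to a bare month name only; the names lookup and month_to_number are hypothetical, and GNU sed is assumed (for \n in the replacement and for back-references with -E).

```shell
#!/bin/sh
# Hypothetical demo of the lookup-table trick on a single month name.
lookup="Jan01Feb02Mar03Apr04May05Jun06Jul07Aug08Sep09Oct10Nov11Dec12"

month_to_number() {
    # Step 1: append a newline and the table, e.g. "May" -> "May\nJan01Feb02...".
    # Step 2: \1 is the month name; find it again after the newline and keep
    #         the two characters that follow it in the table.
    # Input that is not in the table is left unconverted by this sketch.
    printf '%s\n' "$1" |
        sed -E 's/$/\n'"$lookup"'/; s/^(...)\n.*\1(..).*/\2/'
}

month_to_number May   # prints 05
```

The full command above works the same way, except that the back-reference for the month name sits in the middle of the timestamp and the captured digits are spliced between the year and the day.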

Find and replace multiple patterns in a specific CSV column using sed
