Extracting text from a fixed-width string

Time: 2018-02-12 07:06:26

Tags: awk

How can I extract text from a fixed-width string?

For example, this is the output of the docker history command:

# docker history mysql
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
f008d8ff927d        3 weeks ago         /bin/sh -c #(nop)  CMD ["mysqld"]               0B
<missing>           3 weeks ago         /bin/sh -c #(nop)  EXPOSE 3306/tcp              0B
<missing>           3 weeks ago         /bin/sh -c #(nop)  ENTRYPOINT ["docker-ent...   0B
<missing>           3 weeks ago         /bin/sh -c ln -s usr/local/bin/docker-entr...   34B
<missing>           3 weeks ago         /bin/sh -c #(nop) COPY file:52f06a5715711e...   6.04kB
<missing>           3 weeks ago         /bin/sh -c #(nop)  VOLUME [/var/lib/mysql]      0B
<missing>           3 weeks ago         /bin/sh -c {   echo mysql-community-server...   242MB
<missing>           3 weeks ago         /bin/sh -c echo "deb http://repo.mysql.com...   55B
<missing>           3 weeks ago         /bin/sh -c #(nop)  ENV MYSQL_VERSION=5.7.2...   0B
<missing>           2 months ago        /bin/sh -c #(nop)  ENV MYSQL_MAJOR=5.7          0B
<missing>           2 months ago        /bin/sh -c set -ex;  key='A4A9406876FCBD3C...   21.8kB
<missing>           2 months ago        /bin/sh -c apt-get update && apt-get insta...   38.6MB
<missing>           2 months ago        /bin/sh -c mkdir /docker-entrypoint-initdb.d    0B
<missing>           2 months ago        /bin/sh -c set -x  && apt-get update && ap...   4.52MB
<missing>           2 months ago        /bin/sh -c #(nop)  ENV GOSU_VERSION=1.7         0B
<missing>           2 months ago        /bin/sh -c groupadd -r mysql && useradd -r...   330kB
<missing>           2 months ago        /bin/sh -c #(nop)  CMD ["bash"]                 0B
<missing>           2 months ago        /bin/sh -c #(nop) ADD file:1dd78a123212328...   123MB

I only need the third column, "CREATED BY". This is what I tried:

# docker history mysql | awk '{printf "%-20s  %-20s %-48s %-2s \n", $1,$2,$3,$4}'
IMAGE                 CREATED              CREATED                                          BY
f008d8ff927d          3                    weeks                                            ago
<missing>             3                    weeks                                            ago
<missing>             3                    weeks                                            ago
<missing>             3                    weeks                                            ago
<missing>             3                    weeks                                            ago
<missing>             3                    weeks                                            ago
<missing>             3                    weeks                                            ago
<missing>             3                    weeks                                            ago
<missing>             3                    weeks                                            ago
<missing>             2                    months                                           ago
<missing>             2                    months                                           ago
<missing>             2                    months                                           ago
<missing>             2                    months                                           ago
<missing>             2                    months                                           ago
<missing>             2                    months                                           ago
<missing>             2                    months                                           ago
<missing>             2                    months                                           ago
<missing>             2                    months                                           ago

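I guess whitespace cannot serve as the field separator here, because the values themselves contain spaces. I need the characters from position 40 to position 88.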

2 answers:

Answer 0 (score: 2)

Print the 48 characters that follow the first 40, i.e. characters 41 through 88:

   sed "s:^.\{40\}\(.\{48\}\).*:\1:" file

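1) ^.\{40\} - the first 40 characters of the line (. matches any character, \{n\} is the repeat count)

2) \(.\{48\}\) - the next 48 characters (characters 41 through 88), captured as a group

3) .* - the rest of the line

4) \1 - the replacement: the whole line is replaced by the 48 characters captured in \(...\)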
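The substitution can be checked on a single row. The sketch below is only an illustration: the sample row is built with printf under the assumption that the layout is 20 + 20 + 48 characters followed by the SIZE value, as in the question, and the variable names are made up. It also shows that `cut -c41-88` selects the same range.

```shell
# Build one row with the assumed docker history layout:
# 20 + 20 + 48 characters, followed by the SIZE value.
row=$(printf '%-20s%-20s%-48s%s' '<missing>' '3 weeks ago' '/bin/sh -c #(nop)  EXPOSE 3306/tcp' '0B')

# Skip 40 characters, capture the next 48, discard the rest.
by_sed=$(printf '%s\n' "$row" | sed 's/^.\{40\}\(.\{48\}\).*/\1/')

# cut -c selects the same 1-based character range directly.
by_cut=$(printf '%s\n' "$row" | cut -c41-88)

# Brackets make the trailing padding visible.
printf '[%s]\n' "$by_sed"
printf '[%s]\n' "$by_cut"
```

Note that the sed expression only matches lines with at least 88 characters; shorter lines are printed unchanged, whereas cut prints whatever part of the range exists.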

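Answer 1 (score: 1)

If your docker history output always has the same layout, the following may help.

Solution 1: since the OP says the output has fixed-width columns, use that directly with substr: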

awk 'substr($0,41,48){print substr($0,41,48)}'  Input_file
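If the column widths might differ between Docker versions or terminals, the offsets could be read from the header row instead of being hard-coded. This is a minimal sketch, not part of the original answer: `created_by` is a hypothetical helper name, and it assumes the first line is the header and contains the labels "CREATED BY" and "SIZE". Unlike the one-liner above, it skips the header and trims the trailing padding.

```shell
# Hypothetical helper: reads docker history style output on stdin and
# prints the CREATED BY column, with offsets taken from the header row.
created_by() {
  awk '
    NR == 1 {
      start = index($0, "CREATED BY")      # 1-based start of the column
      width = index($0, "SIZE") - start    # the column runs up to SIZE
      next                                 # do not print the header itself
    }
    {
      col = substr($0, start, width)
      sub(/ +$/, "", col)                  # trim the trailing padding
      print col
    }
  '
}
```

It would be used as `docker history mysql | created_by`.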

Solution 2: use awk's built-in match function:

Your_command | awk '{match($0,/\/bin[^[:alnum:]].*/);num=split(substr($0,RSTART,RLENGTH),array," ");for(i=1;i<=(num-1);i++){printf("%s%s",array[i],i==(num-1)?ORS:OFS)}}' 

Here is the same solution in non-one-liner form:

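your_command | awk '
{
  # Locate the command text: "/bin" followed by a non-alphanumeric character.
  match($0,/\/bin[^[:alnum:]].*/);
  # Split the matched text on whitespace; the last field is the SIZE value.
  num=split(substr($0,RSTART,RLENGTH),array," ");
  # Print every field except the last, separated by OFS.
  for(i=1;i<=(num-1);i++){
    printf("%s%s",array[i],i==(num-1)?ORS:OFS)}
}
'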

The output is as follows:

/bin/sh -c #(nop) CMD ["mysqld"]
/bin/sh -c #(nop) EXPOSE 3306/tcp
/bin/sh -c #(nop) ENTRYPOINT ["docker-ent...
/bin/sh -c ln -s usr/local/bin/docker-entr...
/bin/sh -c #(nop) COPY file:52f06a5715711e...
/bin/sh -c #(nop) VOLUME [/var/lib/mysql]
/bin/sh -c { echo mysql-community-server...
/bin/sh -c echo "deb http://repo.mysql.com...
/bin/sh -c #(nop) ENV MYSQL_VERSION=5.7.2...
/bin/sh -c #(nop) ENV MYSQL_MAJOR=5.7
/bin/sh -c set -ex; key='A4A9406876FCBD3C...
/bin/sh -c apt-get update && apt-get insta...
/bin/sh -c mkdir /docker-entrypoint-initdb.d
/bin/sh -c set -x && apt-get update && ap...
/bin/sh -c #(nop) ENV GOSU_VERSION=1.7
/bin/sh -c groupadd -r mysql && useradd -r...
/bin/sh -c #(nop) CMD ["bash"]
/bin/sh -c #(nop) ADD file:1dd78a123212328...
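In the output above, runs of spaces inside a command are collapsed (docker printed `#(nop)  CMD` with two spaces), because the text is split on whitespace and rejoined with OFS. A possible variant that keeps the original spacing is sketched below. It is an assumption-laden illustration rather than the answerer's code: `created_by_match` is a hypothetical name, the regex is simplified to `/bin/`, and, like the solution above, it assumes every command starts with `/bin/` and that SIZE is the last field on the line (an empty COMMENT column).

```shell
# Hypothetical helper: like the match-based solution, but keeps the
# original spacing inside the column.
created_by_match() {
  awk '
    match($0, /\/bin\//) {
      col = substr($0, RSTART)      # from the command to the end of the line
      sub(/ +[^ ]+ *$/, "", col)    # drop the trailing SIZE field and padding
      print col
    }
  '
}
```

It would be used as `docker history mysql | created_by_match`; lines without `/bin/`, such as the header, are skipped.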