Question

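I have several files, each containing a single line of simple text. I want to remove the last character of every word in each file. The text in each file has a different length.

The closest I have come is this, which edits one file: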

awk '{ print substr($1, 1, length($1)-1); print substr($2, 1, length($2)-1); }' file.txt

But I can't figure out how to generalize this for files with different numbers of words.

Answer 1

awk '{for(x=1;x<=NF;x++)sub(/.$/,"",$x)}7' file

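This should do the removal: the loop strips the last character of every field, and the trailing 7 is an always-true pattern that makes awk print the modified line.

If the test goes fine and you want to overwrite your file, you can do: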

awk '{for(x=1;x<=NF;x++)sub(/.$/,"",$x)}7' file > tmp && mv tmp file

Example:

kent$  awk '{for(x=1;x<=NF;x++)sub(/.$/,"",$x)}7' <<<"foo bar foobar"   
fo ba fooba
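
Since the question is about several files, the same one-liner can be wrapped in a shell loop. The following is only a minimal sketch: the function name `strip_last_chars` and the `.tmp` suffix are made up for illustration, and it assumes the files are small text files that are safe to overwrite.

```shell
# Hypothetical helper: rewrite each given file in place, dropping the last
# character of every whitespace-separated word.
strip_last_chars() {
    for f in "$@"; do
        # sub() edits each field; the trailing 1 is a true pattern (like the
        # 7 above) that triggers the default action and prints the record.
        awk '{ for (x = 1; x <= NF; x++) sub(/.$/, "", $x) } 1' "$f" > "$f.tmp" &&
            mv "$f.tmp" "$f"
    done
}

# Usage, assuming the files match *.txt:
#   strip_last_chars *.txt
```

Note that assigning to a field makes awk rebuild the line, so runs of spaces between words collapse to a single space.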

Answer 2

Use awk to loop over the fields of each line, up to NF (the number of fields), and apply the substr function to each one.

awk '{for (i=1; i<=NF; i++) {printf "%s ", substr($i, 1, length($i)-1)}}END{printf "\n"}' file

For a sample input file

ABCD ABC BC

the awk logic produces the output

ABC AB B

Another way, setting the output record separator (ORS) to an empty string and using only print:

awk 'BEGIN{ORS="";}{for (i=1; i<=NF; i++) {print substr($i, 1, length($i)-1); print " "}}END{print "\n"}' file
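
Both commands above print a space after the last word and emit the newline only once, at the end of the input. If that matters, one possible variant prints the separator only between words and ends every input line with a newline. This is a sketch, not part of the original answer; the function name `strip_words` is hypothetical.

```shell
# Hypothetical helper: reads the named files, or stdin when none are given.
strip_words() {
    awk '{
        for (i = 1; i <= NF; i++)
            # Print the field minus its last character, then a space
            # unless this is the last field on the line.
            printf "%s%s", substr($i, 1, length($i) - 1), (i < NF ? " " : "")
        printf "\n"
    }' "$@"
}

printf 'ABCD ABC BC\n' | strip_words
# prints: ABC AB B
```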

Answer 3

I would go for a Bash approach:

Since ${var%?} removes the last character of a variable:

$ var="hello"
$ echo "${var%?}"
hell

you can use the same approach on arrays:

$ arr=("hello" "how" "are" "you")
$ printf "%s\n" "${arr[@]%?}"
hell
ho
ar
yo

So you can go through the files, read each file's only line (you said the files contain just one line) into an array, and use the expansion above to remove the last character of every word.
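
Reading a file's line into an array and applying the expansion to every element might look like the sketch below. It requires bash (arrays and `read -a` are not POSIX sh), and the function name `strip_line_words` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the contents of file $1 with the last character
# of every word removed.
strip_line_words() {
    local words stripped
    # read -a splits the line on whitespace into the array "words"; the ||
    # clause still handles a final line that has no trailing newline.
    while read -r -a words || [ "${#words[@]}" -gt 0 ]; do
        # %? drops the last character of each array element.
        stripped=("${words[@]%?}")
        # [*] inside double quotes joins the elements with a space
        # (the first character of the default IFS).
        printf '%s\n' "${stripped[*]}"
    done < "$1"
}

# Usage, assuming a file named file.txt:
#   strip_line_words file.txt > tmp && mv tmp file.txt
```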

Answer 4

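Sed version, assuming words consist only of letters (if not, just adapt the class [[:alpha:]] to reflect your needs) and are separated by spaces and punctuation: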

sed 's/$/ /;s/[[:alpha:]]\([[:blank:][:punct:]]\)/\1/g;s/ $//' YourFile

awk (actually gawk, because of the regex word boundary):

 gawk '{gsub(/.\>/, "");print}' YourFile

 #or optimized by @kent ;-) thks for the tips
 gawk '4+gsub(/.\>/, "")' YourFile
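
To see how the sed version treats punctuation, here is a small run on a made-up input line. The wrapper name `strip_alpha_tail` is hypothetical; the sed script itself is the one given above.

```shell
# Hypothetical wrapper around the sed command above.
strip_alpha_tail() {
    # 1. s/$/ /    appends a space so the last word also has a delimiter
    # 2. s/...//g  drops any letter directly followed by a blank or
    #              punctuation character, keeping that delimiter
    # 3. s/ $//    removes the space added in step 1
    sed 's/$/ /;s/[[:alpha:]]\([[:blank:][:punct:]]\)/\1/g;s/ $//' "$@"
}

printf 'foo bar, foobar\n' | strip_alpha_tail
# prints: fo ba, fooba
```

A word ending in a digit, such as word1, is left alone by this version, because the digit is not matched by [[:alpha:]].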

Answer 5

$ cat foo
word1
word2 word3
$ sed 's/\([^ ]*\)[^ ]\( \|$\)/\1\2/g' foo
word
word word

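A word is any string of characters other than a space (= [^ ]).

Edit: if you want to enforce POSIX (--posix), you can use:

$ sed --posix 's/\([^ ]*\)[^ ]\([ ]\{0,1\}\)/\1\2/g' foo
word
word word

This changes \( \|$\) to [ ]\{0,1\}, i.e. an optional space at the end.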
