Using a regular expression to insert spaces between the characters of uppercase words

Time: 2014-04-25 14:23:10

Tags: regex

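I want to insert spaces between the characters of a word, but only in words that contain at least 2 uppercase characters. I can use a regular expression for this.

For example: "This is simple SEnTeNCE with a FEW word." -> "This is simple S E n T e N C E with a F E W word."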

2 answers:

Answer 0: (score: 4)

An approach using PHP / PCRE:

$pattern = '~(?:\b(?=(?:\w*[A-Z]){2})|(?!^)\G)\w\B\K~';

$text = preg_replace($pattern, ' ', $text);

Pattern details:

(?:                      # non-capturing group, starting with either:
    \b                   # a word boundary
    (?=(?:\w*[A-Z]){2})  # followed by a word with at least two uppercase letters
  |                      # OR
    (?!^)\G              # anchor: the end of the previous match (but not the start of the string)
)
\w\B                     # a word character followed by another word character
\K                       # discard everything matched so far from the match result
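JavaScript regular expressions support neither \G nor \K, so the pattern above cannot be ported as is. One possible workaround, sketched here under the assumption that the engine supports lookbehind (ES2018), is to match the empty position between two word characters and use a lookbehind to check that the enclosing word has two uppercase letters. The function name spaceOutWithLookbehind is made up for this illustration.

```javascript
// Sketch only: spaceOutWithLookbehind is a hypothetical helper name.
// Assumes lookbehind support (ES2018), e.g. a recent Node.js or browser.
function spaceOutWithLookbehind(text) {
    // (?<= ... )  the position must be preceded by:
    //   \b                    a word boundary (a word start, since \w+ follows),
    //   (?=(?:\w*[A-Z]){2})   whose word holds at least two uppercase letters,
    //   \w+                   and one or more word characters up to this position
    // (?=\w)      and it must be followed by another word character
    return text.replace(/(?<=\b(?=(?:\w*[A-Z]){2})\w+)(?=\w)/g, ' ');
}

console.log(spaceOutWithLookbehind("This is simple SEnTeNCE with a FEW word."));
// prints "This is simple S E n T e N C E with a F E W word."
```

Every match is empty, so the replacement only inserts a space and never removes a character, which is the job \K does in the PCRE version.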

An approach with a callback in JavaScript:

var str = "This is simple SEnTeNCE with a FEW word.";

var res = str.replace(/\b(?:[a-z]*[A-Z]){2,}[a-z]*\b/g, function (m) {
    return m.split('').join(' ');
});

console.log(res);
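Note that the callback pattern only accepts ASCII letters, so a word such as AB1 is left untouched, whereas the PCRE pattern uses \w and would space it out. If the same behaviour is wanted in JavaScript, a possible variant reuses the lookahead from the PCRE pattern. This is only a sketch, and spaceOutWords is a name invented for the example.

```javascript
// Sketch only: spaceOutWords is a hypothetical helper name.
function spaceOutWords(text) {
    // \b(?=(?:\w*[A-Z]){2})  word start with two uppercase letters ahead in the same word
    // \w+                    the whole word (letters, digits, underscore)
    return text.replace(/\b(?=(?:\w*[A-Z]){2})\w+/g, function (word) {
        return word.split('').join(' ');
    });
}

console.log(spaceOutWords("This is simple SEnTeNCE with a FEW word."));
// prints "This is simple S E n T e N C E with a F E W word."
console.log(spaceOutWords("An AB1 code"));
// prints "An A B 1 code"
```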

Answer 1: (score: 1)

A regex solution (PCRE) is:

(?|(?=\b(?:[a-z]*[A-Z]){2})(\w)|(?!^)\G(\w))(?!\b)

(?|                             # branch reset group
  (?= \b (?:[a-z]* [A-Z]){2} )  # lookahead anchored at the beginning of the word:
                                # check we are at the beginning of a word with
                                # two uppercase letters
  (\w)                          # grab the first letter
|                               # OR
  (?!^)\G                       # we're following a previous match (and not
                                # at the beginning of the string)
  (\w)                          # if so we're inside a wanted word, so we grab
                                # a character
)
(?!\b)                          # except if it's the last one (we don't want
                                # too many spaces)

and replace with

\1 # <- there's a space after the \1

See the demo here.

Note that it may be easier to do this in several steps (grab the words, process each one separately, join everything back together)...
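The several-step idea could look roughly like this in JavaScript. It is a minimal sketch rather than part of the original answer, and spaceOutInSteps is a name chosen for the example.

```javascript
// Sketch only: spaceOutInSteps is a hypothetical helper name.
function spaceOutInSteps(text) {
    // Step 1: grab the words. The capturing group makes split() keep the
    // words in the result, alternating with the separators between them.
    const chunks = text.split(/(\w+)/);

    // Step 2: process each chunk on its own. Separators never contain
    // [A-Z] (those are word characters), so they pass through unchanged.
    const processed = chunks.map(function (chunk) {
        const uppercaseCount = (chunk.match(/[A-Z]/g) || []).length;
        return uppercaseCount >= 2 ? chunk.split('').join(' ') : chunk;
    });

    // Step 3: join everything back together.
    return processed.join('');
}

console.log(spaceOutInSteps("This is simple SEnTeNCE with a FEW word."));
// prints "This is simple S E n T e N C E with a F E W word."
```

Counting the uppercase letters in plain code keeps the regular expressions trivial, at the cost of a few more lines.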