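Regex help... apostrophes within words

Date: 2011-11-08 21:33:33

Tags: php regex

I'm trying to write a regular expression that matches the first 8 words of a string (including any trailing punctuation), but I run into a problem when a word contains an apostrophe or single-quote character. My current regex is: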

/(\b[\w,']+[.?'!\"]*\s){8}/

My sample string is:

Went for Valentine's day, food was about a B, filet mignon was served chopped up

Currently, the match I get back is:

s day, food was about a B, filet

But I'd like it to be:

Went for Valentine's day, food was about a

I tried adding ' to my character class, [\w,'], but it doesn't work properly. Any help would be greatly appreciated.

Thanks!

4 answers:

Answer 0: (score: 3)

Although this can be done with a single regular expression, it can at least also be done with preg_split:

$string="Went for Valentine's day, food was about a B, filet mignon was served chopped up";

$words=preg_split("/\s+/",$string);

#If there are more than eight words, only take the first eight elements of $words.
if(count($words)>8)
{
  $words=array_slice($words,0,8);
}

echo implode(" ",$words) . "\n";

This produces the following output:

Went for Valentine's day, food was about a
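The same idea can be wrapped in a small reusable function. This is only a sketch: the name firstWords and its $limit parameter are hypothetical and not part of the answer above, and the PREG_SPLIT_NO_EMPTY flag is an added assumption so that leading or trailing whitespace does not count as an empty word.

```php
<?php
// Hypothetical helper (not from the original answer): return the first $limit
// whitespace-separated words of $text, keeping any attached punctuation.
function firstWords(string $text, int $limit = 8): string
{
    // PREG_SPLIT_NO_EMPTY drops the empty pieces that leading or trailing
    // whitespace would otherwise produce.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    // array_slice() returns the whole array when it holds fewer than
    // $limit elements, so no explicit count() check is needed.
    return implode(' ', array_slice($words, 0, $limit));
}

$string = "Went for Valentine's day, food was about a B, filet mignon was served chopped up";
echo firstWords($string), "\n";
```

Because the split happens on whitespace only, apostrophes and trailing commas stay attached to their words, which is what the question asks for.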

Answer 1: (score: 1)

This basically counts the apostrophe as a word character:

\b(\w|')+\b
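As a rough illustration, the pattern can be run against the sample string from the question with preg_match_all, keeping the first eight matches. The variable names here are made up for the example. Note that this pattern matches the words only, so trailing punctuation such as the comma after "day" is not part of the result.

```php
<?php
$string = "Went for Valentine's day, food was about a B, filet mignon was served chopped up";

// Pattern from the answer: runs of word characters and apostrophes
// that begin and end at a word boundary.
preg_match_all('/\b(\w|\')+\b/', $string, $matches);

// $matches[0] holds the full matches; keep only the first eight.
$firstEight = array_slice($matches[0], 0, 8);

echo implode(' ', $firstEight), "\n";
// prints: Went for Valentine's day food was about a
```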

Answer 2: (score: 0)

If you want to include the apostrophe in your regex, the right way is to do something like this.

[\w']+

Then you can use word boundaries (\b) as needed.

Answer 3: (score: -1)
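One way this character class might be combined with the question's "first 8 words" requirement is sketched below. The pattern is an adaptation for illustration, not the asker's or the answerer's exact regex: it anchors at the start of the string, keeps the comma inside the class as the question did, and allows the last word to end at the end of the string.

```php
<?php
$string = "Went for Valentine's day, food was about a B, filet mignon was served chopped up";

// Up to eight "words": word characters, commas and apostrophes, optionally
// followed by closing punctuation, then whitespace or the end of the string.
$pattern = '/^(?:[\w,\']+[.?!"]*(?:\s+|$)){1,8}/';

if (preg_match($pattern, $string, $m)) {
    // Strip the whitespace that follows the last matched word.
    $firstEight = rtrim($m[0]);
} else {
    // Assumed fallback for the sketch: no match means there is nothing to keep.
    $firstEight = '';
}

echo $firstEight, "\n";
// prints: Went for Valentine's day, food was about a
```

Words containing other characters (a hyphen, for example) would end the match early, so the class would need extending for such input.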

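$text = "Knock, knock. Who's there? r2d2!";
// An apostrophe is treated as part of a word only when it sits between two word characters.
$pattern = "/(?:\w'\w|\w)+/";
// preg_match_all() returns the number of matches; the words themselves land in $matches[0].
$count = preg_match_all($pattern, $text, $matches);
var_dump($matches); // $matches[0] holds: Knock, knock, Who's, there, r2d2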