Question

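I'm trying to find words that start with a specific character, for example:

Lorem ipsum #text Second lorem ipsum. How #are You. It's ok. Done. Something #else now.

I need all the words that start with "#", so my expected result is #text, #are, #else.

Any ideas?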

Answer 1

Search for:

a position that is not preceded by a word character
#
some word characters

So try this:

/(?<!\w)#\w+/

Or in C# it would look like this:

string s = "Lorem ipsum #text Second lorem ipsum. How #are You. It's ok. Done. Something #else now.";
foreach (Match match in Regex.Matches(s, @"(?<!\w)#\w+"))
{
    Console.WriteLine(match.Value);
}

Output:

#text
#are
#else
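
For comparison, here is a minimal JavaScript sketch of the same pattern. The helper name hashWords and the second test string are made up for illustration, and lookbehind needs a reasonably recent engine. It also shows what the lookbehind is for: a # in the middle of a word is skipped.

```javascript
// Hypothetical helper: returns every "#word" that is not preceded by a word character.
const hashWords = (text) => text.match(/(?<!\w)#\w+/g) || [];

const sampleText =
  "Lorem ipsum #text Second lorem ipsum. How #are You. It's ok. Done. Something #else now.";

console.log(hashWords(sampleText));          // [ '#text', '#are', '#else' ]
console.log(hashWords("issue#42 and #ok"));  // [ '#ok' ]  ("#42" follows a word character)
```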

Answer 2

Try this: #(\S+)\s?
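
A quick way to see what this pattern actually captures is the JavaScript sketch below (the helper name captureNonSpace and the inputs are illustrative only). Because \S matches any non-whitespace character, trailing punctuation ends up in the capture, and a # in the middle of a word matches too.

```javascript
// Hypothetical helper: collects capture group 1 of #(\S+)\s? for every match.
const nonSpacePattern = /#(\S+)\s?/g;
const captureNonSpace = (text) =>
  Array.from(text.matchAll(nonSpacePattern), (m) => m[1]);

console.log(captureNonSpace("How #are You. Now #else."));  // [ 'are', 'else.' ]
console.log(captureNonSpace("mid#word"));                  // [ 'word' ]
```

If punctuation should be excluded, \w+ is the narrower choice, as in the other answers.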

Answer 3

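Match a word starting with # after whitespace or at the start of a line. Depending on your use case, the final word boundary may not be needed.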

/(?:^|\s)\#(\w+)\b/

The parentheses capture your word in a group. How you apply the regex from here depends on your language.

(?:...) is a non-capturing group.
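
As one possible way to apply it, here is a JavaScript sketch. The helper name wordsAfterSpace is hypothetical; the g and m flags are additions so that every match is found and ^ matches at each line start; the backslash before # is dropped because # needs no escaping here.

```javascript
// Hypothetical helper: returns the captured word (group 1) for each match.
const boundedPattern = /(?:^|\s)#(\w+)\b/gm;
const wordsAfterSpace = (text) =>
  Array.from(text.matchAll(boundedPattern), (m) => m[1]);

console.log(wordsAfterSpace("#first word, then #second\n#third but not a#fourth"));
// [ 'first', 'second', 'third' ]
```

Note that the leading whitespace is part of the full match (m[0]), which is why the word is read from the capture group.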

Answer 4

To accommodate different languages, I use the following (PCRE / PHP):

'~(?<!\p{Latin})#(\p{Latin}+)~u'

or

$language = 'ex. get form value';
'~(?<!\p{' . $language . '})#(\p{' . $language . '}+)~u'

or loop over multiple scripts

$languages = $languageArray;

$replacePattern = [];

foreach ($languages as $language) {

  $replacePattern[] = '~(?<!\p{' . $language . '})#(\p{' . $language . '}+)~u';

}

$replacement = '<html>$1</html>';

$replaceText = preg_replace($replacePattern, $replacement, $text);

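\w works well, but as far as I know it only covers Latin script.

In the examples above, replace Latin with another script name such as Cyrillic or Phoenician.

The examples above do not work for "RTL" scripts.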
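
The same idea can be sketched in JavaScript, where script properties are written \p{Script=...} and require the u flag. The function name tagsInScript, its scriptName parameter, and the sample sentence are assumptions for illustration, not part of the PHP code above.

```javascript
// Hypothetical helper: finds "#word" where the word consists of letters of one Unicode script.
function tagsInScript(text, scriptName) {
  // Only allow plain script names, since the value is interpolated into the pattern.
  if (!/^[A-Za-z_]+$/.test(scriptName)) {
    throw new Error("unexpected script name: " + scriptName);
  }
  const pattern = new RegExp(
    `(?<!\\p{Script=${scriptName}})#(\\p{Script=${scriptName}}+)`,
    "gu"
  );
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

console.log(tagsInScript("Hello #world and #привет", "Latin"));     // [ 'world' ]
console.log(tagsInScript("Hello #world and #привет", "Cyrillic"));  // [ 'привет' ]
```

A script name the engine does not recognise makes the RegExp constructor throw a SyntaxError, so user-supplied values are worth checking against a known list first.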

Answer 5

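The code below should handle this case.

/\$(\w)+/g finds words starting with $
/#(\w)+/g finds words starting with #

The answer given by Mark Bayers, /(?<!\w)#\w+/, triggers the following warning on RegExr.com: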

"(?<!" The "negative lookbehind" feature may not be supported in all browsers.

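The warning can be resolved by removing the <, which changes the pattern to (?!\w)@\w+. Note that this turns the negative lookbehind into a negative lookahead, which always succeeds in front of @, so the pattern no longer rejects a word character before the symbol.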
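
If lookbehind is unavailable but the "no word character before #" check still matters, one lookbehind-free alternative is to consume the preceding character and read the word from a capture group. The sketch below is an assumption-laden illustration (the function name hashWordsCompat is made up), written with exec in a loop so it does not depend on newer APIs. It also shows why (\w+) is used rather than (\w)+: a repeated group keeps only its last repetition.

```javascript
// Hypothetical helper: lookbehind-free search for words that start with "#".
function hashWordsCompat(text) {
  var pattern = /(?:^|\W)#(\w+)/g;  // start of string or a non-word character, then "#word"
  var found = [];
  var m;
  while ((m = pattern.exec(text)) !== null) {
    found.push(m[1]);               // group 1 holds the word without "#"
  }
  return found;
}

console.log(hashWordsCompat("How #are You. a#b Something #else now."));  // [ 'are', 'else' ]

// A repeated capture group only remembers the last character it matched:
console.log("#are".match(/#(\w)+/)[1]);  // e
console.log("#are".match(/#(\w+)/)[1]);  // are
```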

Regex to find words starting with a specific character