Question

I have been using this regex

/(?:[^ .,;:]+[ .,;:]+){3}(?:term1|term2)(?:[ .,;:]+[^ .,;:]+){3}/gi

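to extract the selected terms together with the 3 words before and after them. I would like to change the regex so that it extracts the whole line containing a selected term instead. Lines are delimited by \n, and I also want to trim leading and trailing whitespace. How should I change the regex?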

Sample input:

   This line, containing  term2, I'd like to extract.  
        This line contains term13 and I'd like to ignore it  
  This line, on the other hand, contains term1, so let's keep it.

The output would be:

This line, containing  term2, I'd like to extract.
This line, on the other hand, contains term1, so let's keep it.

See the code to be changed below.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>Untitled Document</title>
</head>

<body>
<script>
var Input = "   This line, containing  term2, I'd like to extract.";
Input += "        This line contains term13 and I'd like to ignore it.";
Input += "  This line, on the other hand, contains term1, so let's keep it.";

// match() returns null when nothing matches, so fall back to an empty array
var matches = Input.match(/(?:[^ .,;:]+[ .,;:]+){3}(?:term1|term2)(?:[ .,;:]+[^ .,;:]+){3}/gi) || [];
var myMatches = "";
for (var i = 0; i < matches.length; i++)
{
  myMatches += ("..." + matches[i] + "...\n"); // assign to variable
}
alert(myMatches);
</script>


</body>
</html>

Answer 1

As Asad pointed out, you can use \b as a word boundary, so that term1 does not match inside term13.
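A quick way to check the word-boundary behaviour in JavaScript is sketched below. The variable name term is only illustrative and is not part of the question's code.

```javascript
// \b matches between a word character and a non-word character,
// so "term1" followed by a digit (as in "term13") is rejected.
var term = /\b(?:term1|term2)\b/i;

console.log(term.test("contains term1, so"));  // true
console.log(term.test("contains term13 and")); // false
```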

The regex:

^ *(.*\b(?:term1|term2)\b.*) *$

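should do what you want. Your matches will be in the first (and only) capture group; just loop over them and you are done. In JavaScript the regex needs the m flag so that ^ and $ match at every line break. Note also that the final .* is greedy and will pull trailing spaces into the capture group; making it lazy (.*?) lets the closing " *" strip them.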
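Here is a minimal sketch of how this could look in JavaScript, assuming the input lines are separated by \n. The function name extractLines is hypothetical, and the regex differs from the one above in two ways: it adds the m flag and makes the trailing .* lazy so that trailing spaces stay out of the capture group.

```javascript
// Hypothetical helper: returns the trimmed lines that contain
// term1 or term2 as a whole word.
function extractLines(input) {
  // g: find every matching line, i: ignore case,
  // m: let ^ and $ match at each line break.
  // The final .*? is lazy so that " *$" can absorb trailing spaces.
  var re = /^ *(.*\b(?:term1|term2)\b.*?) *$/gim;
  var lines = [];
  var m;
  while ((m = re.exec(input)) !== null) {
    lines.push(m[1]); // capture group 1 holds the trimmed line
  }
  return lines;
}

var input =
  "   This line, containing  term2, I'd like to extract.  \n" +
  "        This line contains term13 and I'd like to ignore it  \n" +
  "  This line, on the other hand, contains term1, so let's keep it.";

var result = extractLines(input);
console.log(result.join("\n"));
```

This only strips spaces. If the lines may also be indented with tabs, replacing each " *" with "[ \t]*" should cover that case.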

See it on rubular.
