AS3 RegExp question

Date: 2011-03-26 15:05:02

Tags: regex actionscript-3

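I'm writing a Flex application that handles some Wikipedia text content as a string. I'm trying to use RegExp to strip out all the Wikipedia markup. Here is an example:

I want this: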

var pageText:String = new String("was an [[People of the United States|American]] [[film director]], writer, [[Film producer|producer]], and [[photographer]] who lived in England during most of the last four decades of his career. Kubrick was noted for the scrupulous care with which he chose his subjects, his slow method of working, the variety of genres he worked in, his technical perfectionism, and his reclusiveness about his films and personal life. He maintained almost complete artistic control, making movies according to his own whims and time constraints, but with the rare advantage of big-[[Movie studio|studio]] [[financial support]] for all his endeavors.");

to look like this:

var pageText:String = new String("was an American film director, writer, producer, and photographer who lived in England during most of the last four decades of his career. Kubrick was noted for the scrupulous care with which he chose his subjects, his slow method of working, the variety of genres he worked in, his technical perfectionism, and his reclusiveness about his films and personal life. He maintained almost complete artistic control, making movies according to his own whims and time constraints, but with the rare advantage of big-studio financial support for all his endeavors.");

So I need to write a RegExp for [[remove this part|but keep this]].

I tested a few of these:

           var pattern:RegExp = new RegExp(/\[\[(.+)\|/);
           var pattern2:RegExp = new RegExp(/^\[\[\|/);
           var pattern3:RegExp = new RegExp(/^\[\[[A-Z].*\|$/);

           var pageTextCleaned:String = pageText.replace(pattern, " ");

Removing the remaining [[ and ]] afterwards would be easy.

I'm completely new to this RegExp stuff, so any help would be great!

Thanks!

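3 answers:

Answer 0: (score: 4)

You are using the RegExp constructor, which takes a string as its argument, but you are passing it a RegExp. I don't think that does what you want. See whether it works with a RegExp literal: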

var pageTextCleaned:String = pageText.replace(/\[\[([^\]]*\|)?([^\]]+)]]/g, "$2");

This is not very robust if you have a ] or more than one | inside a [[...]], but it's a start.
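To make that caveat concrete, here is a small sketch in JavaScript, whose regex literal syntax AS3 shares. The helper name `stripWikiLinks` and the sample strings are made up for illustration; only the pattern comes from the answer above.

```javascript
// Hypothetical helper wrapping the pattern from the answer above.
function stripWikiLinks(text) {
  // Group 1 (optional): everything up to and including the last "|".
  // Group 2: the label to keep.
  return text.replace(/\[\[([^\]]*\|)?([^\]]+)]]/g, "$2");
}

// Piped and plain links are both reduced to their visible label.
const piped = stripWikiLinks("an [[People of the United States|American]] [[film director]]");

// With several pipes, the greedy first group swallows all but the last segment.
const multiPipe = stripWikiLinks("[[a|b|c]]");

// A "]" inside the link prevents any match, so the markup is left untouched.
const nested = stripWikiLinks("[[File:a[1].png|thumb]]");

console.log(piped);     // an American film director
console.log(multiPipe); // c
console.log(nested);    // [[File:a[1].png|thumb]]
```

Whether keeping only the last segment is acceptable depends on which kinds of links appear in your text, so treat this as a starting point rather than a complete markup stripper.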

Answer 1: (score: 0)

I don't know AS3, but here is JavaScript code that does it, and AS3 should be similar:

s = s.replace(/\[\[(?:([^\]|]*)|[^\]|]*\|([^\]]*))\]\]/g, '$1$2');

The regex is a bit confusing. Here is a breakdown:

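  • \[\[ - two opening square brackets.
  • (?: | ) - a non-capturing group with two alternatives:

    • ([^\]|]*) - link content with no pipe character; the whole content is captured into the first group, $1
    • [^\]|]*\|([^\]]*) - a link with a pipe character:
      • [^\]|]* - some characters that are neither ] nor |
      • \| - a literal pipe character.
      • ([^\]]*) - more non-] characters, captured into the second group, $2
  • \]\] - two closing square brackets.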

We then replace each match with $1$2: one of the two groups is always empty, and the other is the string we want to keep.

Working example: http://jsbin.com/adedu4
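As a rough illustration of the breakdown, the following sketch prints what each group captures for a piped and an unpiped link. The pattern is copied from the answer; the variable names and the sample text are assumptions made for the example.

```javascript
// Pattern copied from the answer above; the sample text is illustrative.
const linkPattern = /\[\[(?:([^\]|]*)|[^\]|]*\|([^\]]*))\]\]/g;
const sample = "an [[People of the United States|American]] [[film director]]";

// Record what $1 and $2 hold for every link (an unmatched group becomes "").
const captures = [];
for (const m of sample.matchAll(linkPattern)) {
  captures.push([m[1] || "", m[2] || ""]);
}

// Replacing with "$1$2" keeps whichever group actually matched.
const cleaned = sample.replace(linkPattern, "$1$2");

console.log(JSON.stringify(captures)); // [["","American"],["film director",""]]
console.log(cleaned);                  // an American film director
```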

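Answer 2: (score: 0)

Since I'm not sure whether a link can have more than two entries, here is a looping solution that replaces each entry ending in "|" with "[[" until none are left, and then removes the "[[" and "]]". If there are always exactly two entries, you can simplify it a bit to speed things up: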

 // "[[target|" up to and including the first pipe; the target may contain spaces.
 var entryPattern:RegExp = /\[\[[^\]|]*\|/;
 // A remaining "[[" or "]]" (an alternation, not a character class).
 var bracketPattern:RegExp = /\[\[|\]\]/;

 var pageText:String = "your text";
 var previousText:String;

 // Strip one "[[target|" prefix per pass until the text stops changing.
 do {
    previousText = pageText;
    pageText = pageText.replace(entryPattern, "[[");
 } while( pageText != previousText );

 // Remove the remaining "[[" and "]]" the same way; the result ends up in pageText.
 do {
    previousText = pageText;
    pageText = pageText.replace(bracketPattern, "");
 } while( pageText != previousText );

You may want to put the replace loop into your own "replaceAll" utility function, since it is needed all over the place.
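Such a utility might look like the sketch below, written in JavaScript because the loop carries over to AS3 almost unchanged. The function name `replaceUntilStable` is hypothetical. The two patterns are also assumptions: they are variants of the ones in this answer that allow spaces in the link target and match "[[" or "]]" as whole tokens.

```javascript
// Hypothetical helper: re-apply a single (non-global) replacement
// until the text stops changing.
function replaceUntilStable(text, pattern, replacement) {
  let previous;
  do {
    previous = text;
    text = text.replace(pattern, replacement);
  } while (text !== previous);
  return text;
}

const raw = "big-[[Movie studio|studio]] [[financial support]]";

// Step 1: drop every "[[target|" prefix, leaving "[[label]]".
const withoutTargets = replaceUntilStable(raw, /\[\[[^\]|]*\|/, "[[");

// Step 2: drop the remaining "[[" and "]]".
const plain = replaceUntilStable(withoutTargets, /\[\[|\]\]/, "");

console.log(plain); // big-studio financial support
```

Note that the loop only terminates if the replacement eventually stops producing new matches. Where a single pass is enough, a regex with the g flag replaces every match in one call, which avoids the loop entirely.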