Regex does not return any results

Date: 2012-01-13 17:04:23

Tags: regex boost

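I have a few questions about boost::regex. I tried the example below.

1) What is the fourth argument to sregex_token_iterator? It sounds like a "default match", but why would you want that instead of simply getting nothing back? I tried it without the fourth argument, but it did not compile.

2) I get the output: (1,0) (0,0) (3,0) (0,0) (5,0)

Can anyone explain what is going wrong?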

#include <iostream>
#include <sstream>
#include <vector>
#include <boost/regex.hpp>

// This example extracts X and Y from ( X , Y ), (X,Y), (X, Y), etc.


struct Point
{
   int X;
   int Y;
   Point(int x, int y): X(x), Y(y){}
};

typedef std::vector<Point> Polygon;

int main()
{
  Polygon poly;
  std::string s = "Polygon: (1.1,2.2), (3, 4), (5,6)";

  std::string floatRegEx = "[0-9]*\\.?[0-9]*"; // zero or more digits, then an optional '.', then zero or more digits.
  // The \\. stands for \. because the first \ is the C++ escape character and the second \ is the regex escape character
  //const boost::regex r("(\\d+),(\\d+)");
  const boost::regex r("(\\s*" + floatRegEx + "\\s*,\\s*" + floatRegEx + "\\s*)");
  // \s is white space. We want this to allow (2,3) as well as (2, 3) or ( 2 , 3 ) etc.

  const boost::sregex_token_iterator end;
  std::vector<int> v; // This type has nothing to do with the type of objects you will be extracting
  v.push_back(1);
  v.push_back(2);

  for (boost::sregex_token_iterator i(s.begin(), s.end(), r, v); i != end;)
  {
    std::stringstream ssX;
    ssX << (*i).str();
    float x;
    ssX >> x;
    ++i;

    std::stringstream ssY;
    ssY << (*i).str();
    float y;
    ssY >> y;
    ++i;

    poly.push_back(Point(x, y));
  }

  for(size_t i = 0; i < poly.size(); ++i)
  {
    std::cout << "(" << poly[i].X << ", " << poly[i].Y << ")" << std::endl;
  }
  std::cout << std::endl;

  return 0;
}

1 answer:

Answer 0 (score: 0)

Every part of your number regex is optional:

"[0-9]*\\.?[0-9]*"

so it also matches the empty string. As a result, "(\\s*" + floatRegEx + "\\s*,\\s*" + floatRegEx + "\\s*)" also matches a lone comma.

You should require at least one digit:

"(?:[0-9]+(?:\\.[0-9]*)?|\\.[0-9]+)"

This allows 1, 1.1, 1. and .1 but not a lone dot (.).

(?:          # Either match...
 [0-9]+      # one or more digits, then
 (?:         # try to match...
  \.         #  a dot
  [0-9]*     #  and optional digits
 )?          # optionally.
|            # Or match...
 \.[0-9]+    # a dot and one or more digits.
)            # End of alternation