Selectively reading a formatted data file in C++

Time: 2015-08-07 21:58:21

Tags: c++ iterator ifstream getline stringstream

I have a data file that starts like this:

/*--------------------------------------------------------------------------*\
Some useless commented information
\*---------------------------------------------------------------------------*/

// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //


882
(
(0 0 0)
(1 1 1)
...more vectors
)

How should I go about reading the file and storing the number 882 and the list of vectors in a data structure?

Basically, I want to use the data inside the parentheses, so that (1 2 3) becomes vec.x = 1, vec.y = 2, vec.z = 3.

Here is my attempt to at least print out the number 882, which it does:

#include <iostream>
#include <fstream>
#include <string>
#include <sstream>
#include <vector>
#include <iterator>

int main()
{

  std::string line;
  std::ifstream file ("points");
  if (file.is_open())
  {
    while ( getline (file,line) )
    {
            std::stringstream ss(line);
            int n;
            std::vector<int> v;

                while (ss >> n)
                {
                    v.push_back(n);
                }
                std::copy(v.begin(), v.end(), std::ostream_iterator<int>(std::cout, " "));

    }
    file.close();
  }

  else std::cout << "Unable to open file";

  return 0;

}

1 Answer:

Answer 0: (score: 2)

It depends on exactly which kinds of lines your formatted data file can contain.

Suppose your data file follows this pattern:

/* THIS IS BEGIN OF COMMENT BLOCK */
STILL MORE USELESS COMMENTS
812
that 812 is still useless
\* END OF COMMENT BLOCK *\

// **** Single line comment *** //

// **** its fine to have blank lines ***** //

812
(
(1 2 3)
// **** Comments can come anywhere **** //
(4 5 6)
.... MORE VECTORS ...
(7 8 9)
/***** EVENT BLOCK COMMENTS ****/
\***** ******\

// **** Blank lines allowed anywhere **** //
)

You can set up a simple state machine to process your data file.

You will have several states:

1. Looking for initial number
   a. Inside Comment Block
   b. Not inside Comment Block
2. Looking for start of list of vectors
   a. Inside Comment Block
   b. Not inside Comment Block
3. Reading list of vectors / Looking for end of list of vectors
   a. Inside Comment Block
   b. Not inside Comment block

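You are basically looking for three things: the initial number, the start of the vector list, and the end of the vector list. For each of these, there are two basic cases that affect how you handle a line: whether or not you are inside a block comment.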

If you are inside a block comment, ignore everything until you find the end of the block comment.

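Otherwise, process the line to determine whether it is a blank line, the start of a comment block, a comment line, or the thing you are currently looking for.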
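
As a rough illustration, the states listed above could be modeled with an enum for "what we are looking for" plus a flag for "inside a block comment", instead of a bare integer. The names `Stage`, `ParserState`, and `next_stage` are hypothetical and chosen only for this sketch.

```cpp
// Hypothetical names: the three things the parser looks for, plus a terminal stage.
enum class Stage { InitialNumber, ListStart, ListBody, Done };

// The two-part state: what we are looking for, and whether we are inside a block comment.
struct ParserState {
    Stage stage = Stage::InitialNumber;
    bool in_comment = false;
};

// Advance to the next stage once the current target has been found.
inline Stage next_stage(Stage s) {
    switch (s) {
        case Stage::InitialNumber: return Stage::ListStart;
        case Stage::ListStart:     return Stage::ListBody;
        case Stage::ListBody:      return Stage::Done;
        case Stage::Done:          return Stage::Done;
    }
    return Stage::Done; // not reached; keeps compilers from warning about a missing return
}
```

Keeping `in_comment` separate from `stage` mirrors the a/b sub-cases in the list: the comment flag can flip at any stage without changing what the parser is looking for.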

Code

#include <iostream>
#include <vector>
#include <fstream>
#include <string>
#include <sstream>
using namespace std;

struct vec{
  int x;
  int y;
  int z;
};

/* I'll leave these for you to try out yourself. You would know best how each of these is defined */
bool block_comment_start(const string& line);
bool block_comment_end(const string& line);
bool is_number(const string& line);
bool is_point(const string& line);
bool is_start_of_point_list(const string& line);
bool is_end_of_point_list(const string& line);
int parse_num(const string& line){
  int tmp;
  istringstream ss(line);
  ss >> tmp;
  return tmp;
}
vec parsePoint(const string& line){
  vec tmp;
  char lp; /* ignore left parenthesis at beginning of point*/
  istringstream ss(line);
  ss >> lp >> tmp.x >> tmp.y >> tmp.z;
  return tmp;
}

int main(){
    string line;
    int state(0);        /* we're initially looking for a number */
    bool comment(false); /* We're initially not inside a comment block */

    int val;
    vector<vec> points;

    ifstream file("points");
    if (file.is_open()){
      while (getline(file, line)){
        if (comment){
          if (block_comment_end(line))
            comment = false;
        } else if (state == 0){ // Looking for initial number
          if (block_comment_start(line))
            comment = true;
          else if (is_number(line)){
            val = parse_num(line);
            ++state;
          } /* ignore anything that isn't a number or begin of comment line */
        } else if (state == 1){
          if (block_comment_start(line))
            comment = true;
          else if (is_start_of_point_list(line)){
            ++state;
          }
        } else if (state == 2){
          if (block_comment_start(line))
            comment = true;
          else if (is_end_of_point_list(line)){
            ++state;
          } else if (is_point(line)){
            points.push_back(parsePoint(line));
          }
        } /* Ignore everything after end of list of vectors */
      }
    } else {
      cout << "Error opening file: \"points\"";
    }
    return 0;
}

bool is_point(const string& line){
  /* Returns true if the first character of the line is '(' and the last character is ')'.
     This matches anything between parentheses. The empty() check avoids indexing an empty line. */
  return !line.empty() && line[0] == '(' && line[line.length() - 1] == ')';
}

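This is an outline of how to parse the file. What remains is to write the functions that determine exactly what counts as a comment line, the start of a comment block, the end of a comment block, and so on.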
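
For reference, here is one possible sketch of the remaining helpers. It rests on assumptions read off the sample files rather than on any stated format: a block comment opens on a line beginning with `/*` and closes on a line beginning with `\*`, the initial number sits alone on its line, and the list delimiters `(` and `)` each sit alone on a line. The `trim` helper is hypothetical and added only for this sketch. Adjust the rules if your files differ.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: strip leading/trailing whitespace (including a stray '\r').
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Assumption: a block comment opens on a line that begins with "/*".
bool block_comment_start(const std::string& line) {
    return trim(line).compare(0, 2, "/*") == 0;
}

// Assumption: a block comment closes on a line that begins with "\*".
bool block_comment_end(const std::string& line) {
    return trim(line).compare(0, 2, "\\*") == 0;
}

// Assumption: the count line contains nothing but decimal digits.
bool is_number(const std::string& line) {
    const std::string t = trim(line);
    return !t.empty() &&
           std::all_of(t.begin(), t.end(),
                       [](unsigned char c) { return std::isdigit(c) != 0; });
}

// Assumption: the list opens and closes with a lone parenthesis on its own line.
bool is_start_of_point_list(const std::string& line) { return trim(line) == "("; }
bool is_end_of_point_list(const std::string& line)   { return trim(line) == ")"; }
```

Single-line `//` comments and blank lines need no predicate of their own under these rules: they match none of the helpers, so the main loop simply skips them.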