Tokenize a string and store the result in boost::iterator_range<std::string::iterator>

Time: 2014-10-15 14:56:28

Tags: c++ boost tokenize iterator-range

I need to tokenize a text (with ' ', '\n', and '\t' as delimiters), using something like:

std::string text = "foo   bar";
boost::iterator_range<std::string::iterator> r = some_func_i_dont_know(text);

Later I want to produce the output with:

for (auto i: result)
    std::cout << "distance: " << std::distance(text.begin(), i.begin())
        << "\nvalue: " << i << '\n';

The example above should produce:

distance: 0
value: foo
distance: 6
value: bar

Thanks for your help.

1 answer:

Answer 0: (score: 2)

I wouldn't use the ancient Tokenizer here. Just use the split function from Boost String Algorithm:

Live On Coliru

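#include <boost/algorithm/string.hpp>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

using namespace boost;

int main()
{
    std::string text = "foo   bar";
    // View over the whole string; the tokens below are sub-ranges of it.
    boost::iterator_range<std::string::iterator> r(text.begin(), text.end());

    // Each token is a pair of iterators into text, so no characters are copied.
    std::vector<iterator_range<std::string::const_iterator> > result;
    // token_compress_on merges adjacent delimiters into a single separator.
    algorithm::split(result, r, is_any_of(" \n\t"), algorithm::token_compress_on);

    for (auto i : result)
        std::cout << "distance: " << std::distance(text.cbegin(), i.begin()) << ", "
                  << "length: " << i.size() << ", "
                  << "value: '" << i << "'\n";
}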

Prints

distance: 0, length: 3, value: 'foo'
distance: 6, length: 3, value: 'bar'
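
For comparison, the same idea (tokens stored as iterator pairs into the original string rather than as copies) can be sketched with the standard library alone. This is only an illustration, not part of the answer above: `Token`, `tokenize`, and `print_tokens` are hypothetical names, and unlike the Boost version this sketch skips leading and trailing delimiters instead of reporting empty tokens.

```cpp
#include <cstddef>
#include <iostream>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper type (not from Boost): a token is a [first, last)
// pair of iterators pointing into the source string.
using Token = std::pair<std::string::const_iterator, std::string::const_iterator>;

// Splits text on any character in delims, merging runs of delimiters.
// The returned iterators refer to text, so text must outlive the result.
std::vector<Token> tokenize(const std::string& text,
                            const std::string& delims = " \n\t")
{
    std::vector<Token> tokens;
    std::size_t pos = text.find_first_not_of(delims);
    while (pos != std::string::npos) {
        std::size_t end = text.find_first_of(delims, pos);
        if (end == std::string::npos)
            end = text.size();  // last token runs to the end of the string
        tokens.emplace_back(text.cbegin() + static_cast<std::ptrdiff_t>(pos),
                            text.cbegin() + static_cast<std::ptrdiff_t>(end));
        pos = text.find_first_not_of(delims, end);
    }
    return tokens;
}

// Prints each token in the format requested in the question.
void print_tokens(const std::string& text)
{
    for (const Token& t : tokenize(text))
        std::cout << "distance: " << std::distance(text.cbegin(), t.first)
                  << "\nvalue: " << std::string(t.first, t.second) << '\n';
}
```

Calling `print_tokens("foo   bar")` reports distances 0 and 6 for `foo` and `bar`, matching the output requested in the question. The Boost `split` version is still the shorter choice when Boost is already a dependency.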