Question

Suppose the file I am reading contains this line:

>NZ_FNBK01000055.1 Halorientalis regularis

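How can I extract the name from a line that starts with a greater-than sign? Everything after the greater-than sign (excluding the newline at the end of the line) is the name. The name should be: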

NZ_FNBK01000055.1 Halorientalis regularis

This is my code so far:

bool file::load(istream& file)
{
    string line;
    while(getline(genomeSource, line)){
        if(line.find(">") != string::npos)
        {
            m_name = 
        }
    }
    return true;
}

Answer 1

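You can handle both parts (recognizing the header line and extracting the name) with a regular expression. C++ introduced <regex> in C++11. Use it with a regular expression such as: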

>.*? (.*?) .*$

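> matches the literal character.
.*? followed by a space lazily matches anything up to the first space.
(.*?) followed by a space lazily matches anything up to the next space and captures those characters in group 1.
.*$ greedily matches everything up to the end of the string.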

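With this approach you can check whether the line meets your criteria and get the name at the same time. Here is a test showing that it works. As for the code, the C++11 regex library is very simple to use: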

#include <iostream>
#include <regex>
#include <string>

int main() {
    std::string s = ">NZ_FNBK01000055.1 Halorientalis regularis    ";
    std::regex rgx(">.*? (.*?) .*$"); // Build the regex
    std::smatch matches;

    if (std::regex_search(s, matches, rgx)) { // Search the line
        if (matches.size() > 1) { // Group 1 exists: print it
            std::cout << "The name is " << matches[1].str() << "\n";
        }
    }
}

Here is a live example.
