Split a string into an array of words

Asked: 2016-01-14 15:43:14

Tags: c# arrays string

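I want to split a string into an array of words without using string.Split. I have tried the code below and the word extraction works, but I can't get each word assigned to its own slot in the array: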

string str = "Hello, how are you?";
string tmp = "";
int word_counter = 0;
for (int i = 0; i < str.Length; i++)
{
    if (str[i] == ' ')
    {
        word_counter++;
    }
}
string[] words = new string[word_counter+1];

for (int i = 0; i < str.Length; i++)
{
    if (str[i] != ' ')
    {
        tmp = tmp + str[i];
        continue;
    }
    // here is the problem, i cant assign every tmp in the array
    for (int j = 0; j < words.Length; j++)
    {
        words[j] = tmp;
    }
    tmp = "";
}

3 answers:

Answer 0 (score: 5)

You just need an index pointer that tracks the next free slot, so that the words are put into the array one by one:

string str = "Hello, how are you?";
string tmp = "";
int word_counter = 0;
for (int i = 0; i < str.Length; i++) {
    if (str[i] == ' ') {
        word_counter++;
    }
}
string[] words = new string[word_counter + 1];
int currentWordNo = 0; //at this index pointer
for (int i = 0; i < str.Length; i++) {
    if (str[i] != ' ') {
        tmp = tmp + str[i];
        continue;
    }
    words[currentWordNo++] = tmp; //change your loop to this
    tmp = "";
}
words[currentWordNo++] = tmp; //do this for the last assignment

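In my example, the index pointer is named currentWordNo. The inner loop in the question wrote the same tmp into every slot each time a space was found; the index pointer writes each word to exactly one slot instead.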
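For reference, the same index-pointer idea can be wrapped in a small helper. This is only a minimal sketch, not part of the original answer: the name SplitOnSpaces is hypothetical, and the top-level statements assume C# 9 or later. Like the code above, it splits on the space character only, so punctuation stays attached to the words.

```csharp
using System;

// Hypothetical helper (not from the answer): splits on ' ' only, using an index pointer.
static string[] SplitOnSpaces(string str)
{
    // First pass: count spaces to size the array (n spaces -> n + 1 words).
    int spaces = 0;
    foreach (char c in str)
    {
        if (c == ' ') spaces++;
    }

    string[] words = new string[spaces + 1];
    int currentWordNo = 0; // next free slot in words
    string tmp = "";

    // Second pass: collect characters and store the word whenever a space is hit.
    foreach (char c in str)
    {
        if (c != ' ')
        {
            tmp += c;
            continue;
        }
        words[currentWordNo++] = tmp;
        tmp = "";
    }
    words[currentWordNo] = tmp; // the last word has no trailing space
    return words;
}

string[] result = SplitOnSpaces("Hello, how are you?");
Console.WriteLine(string.Join("|", result)); // prints Hello,|how|are|you?
```

Note that consecutive spaces produce empty strings in the result, exactly as in the loop above.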

Answer 1 (score: 4)

Try using a regular expression, like this:

  // requires: using System.Linq; using System.Text.RegularExpressions;
  string str = "Hello, how are you?";

  // words == ["Hello", "how", "are", "you"] 
  string[] words = Regex.Matches(str, "\\w+")
    .OfType<Match>()
    .Select(m => m.Value)
    .ToArray();

String.Split is not a good choice here, because there are too many characters to split on: ' ' (space), '.', ',', ';', '!' and so on.

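A word is not just whatever sits between spaces: there is punctuation to consider, non-breaking spaces, and so on. Take a look at input like this: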

  string str = "Bad(very bad) input to test. . ."

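Note:

  1. There is no space after "Bad"
  2. There is a non-breaking space
  3. There are additional spaces after the full stops
  4. The correct output should be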

      ["Bad", "very", "bad", "input", "to", "test"] 
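The difference can be checked with a short sketch like the one below. It is an illustration under assumptions rather than part of the original answer: it assumes C# 9 or later for top-level statements, and the non-breaking space is written explicitly as \u00A0 at a position chosen for the example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// "\u00A0" is a non-breaking space; its position here is chosen for illustration.
string tricky = "Bad(very\u00A0bad) input to test. . .";

// Naive approach: split on the ordinary space only.
// Punctuation and the non-breaking space stay inside the pieces.
string[] naive = tricky.Split(' ');

// Regex approach from this answer: every run of word characters is a word.
string[] words = Regex.Matches(tricky, @"\w+")
    .OfType<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(string.Join("|", naive));
Console.WriteLine(string.Join("|", words)); // prints Bad|very|bad|input|to|test
```

Keep in mind that \w also matches digits and the underscore, so "words" here means runs of letters, digits and underscores.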
    

Answer 2 (score: 0)

You can also collect the words in a List<string> instead of an array:

    string str = "Hello, how are you?";
    string tmp = "";
    List<string> ListOfWords = new List<string>();

    for (int i = 0; i < str.Length; i++)
    {
        if (str[i] != ' ')
        {
            tmp = tmp + str[i];
            continue;
        }
        // a space ends the current word: store it and start a new one
        ListOfWords.Add(tmp);
        tmp = "";
    }
    ListOfWords.Add(tmp);

This way you avoid counting the words up front, and the code is simpler. Use ListOfWords[x] to read any word.
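Since the question asks for an array, the list can be converted at the end with ToArray(). The sketch below is not from the original answer: SplitToArray is a hypothetical name, the top-level statements assume C# 9 or later, and the check for an empty tmp is an addition that skips the empty entries consecutive spaces would otherwise produce.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not from the answer): collect words in a List<string>, return an array.
static string[] SplitToArray(string str)
{
    var listOfWords = new List<string>();
    string tmp = "";

    foreach (char c in str)
    {
        if (c != ' ')
        {
            tmp += c;
            continue;
        }
        // Addition to the answer's loop: skip empty words caused by repeated spaces.
        if (tmp.Length > 0) listOfWords.Add(tmp);
        tmp = "";
    }
    if (tmp.Length > 0) listOfWords.Add(tmp); // the last word has no trailing space

    return listOfWords.ToArray();
}

// Note the double space after "Hello,".
string[] words = SplitToArray("Hello,  how are you?");
Console.WriteLine(words.Length); // prints 4
Console.WriteLine(words[1]);     // prints how
```

The list grows as needed, so no first pass over the string is required to size the result.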