Assigning scores to sentences based on the words they contain

Time: 2013-06-13 16:36:16

Tags: c#

Thanks in advance for any help.

My words.txt file looks like this:

await   -1

awaited -1

award   3

awards  3

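The values are tab-separated. First, I want to get a result such as await = -1 point, and then give each sentence in my comment.txt file a score based on the words.txt file. The program's output should look like this (for example):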

-1.0

2.0

0.0

5.0

I'm stuck and don't know what to do next. So far I have only managed to read the words.txt file.

    const char DELIM = '\t'; 
    const string FILENAME = @"words.txt"; 

    string record;  
    string[] fields; 

    FileStream inFile; 
    StreamReader reader; 


    inFile = new FileStream(FILENAME, FileMode.Open, FileAccess.Read);

    reader = new StreamReader(inFile);

    record = reader.ReadLine();

    // Split the line on the delimiter and
    // store the resulting pieces in a string array
    fields = record.Split(DELIM);

    double values = double.Parse(fields[1]);
    string words = fields[0];

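4 Answers:

Answer 0: (score: 1)

You should look at Dictionary. You can map each word you want to score to its value in the dictionary. That way you can loop over all the words and output their values.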

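using System;
using System.Collections.Generic;
using System.IO;

class Program
{
    static void Main()
    {
        Dictionary<string, int> dictionary = new Dictionary<string, int>();
        dictionary.Add("await", -1);
        dictionary.Add("awaited", -1);
        dictionary.Add("award", 3);
        dictionary.Add("awards", 3);

        // Read your file and split its content into an array.
        // Assumption: the file to score is comment.txt and words are separated by
        // whitespace (tabs included); adjust the separators to match your data.
        string[] array = File.ReadAllText("comment.txt")
            .Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);

        for (int i = 0; i < array.Length; i++)
        {
            // Output the value of every word that has an entry in the dictionary
            int value;
            if (dictionary.TryGetValue(array[i], out value))
                Console.WriteLine("{0} = {1}", array[i], value);
        }
    }
}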
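To get one score per sentence rather than one value per word, the dictionary lookup can be wrapped in a small helper that sums the values of the known words. The following is only a sketch: the `SentenceScorer` class, its `Score` method, the separator list, and the case-insensitive comparer are illustrative assumptions, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;

static class SentenceScorer
{
    // Hypothetical helper: sums the scores of all known words in one sentence.
    public static double Score(string sentence, IDictionary<string, double> scores)
    {
        // Split on whitespace and common punctuation; drop empty pieces.
        char[] separators = { ' ', '\t', ',', '.', ':', ';', '!', '?' };
        string[] words = sentence.Split(separators, StringSplitOptions.RemoveEmptyEntries);

        double total = 0;
        foreach (string word in words)
        {
            double value;
            // Words without a dictionary entry contribute nothing.
            if (scores.TryGetValue(word, out value))
                total += value;
        }
        return total;
    }

    static void Main()
    {
        // Assumption: matching should ignore case, so "Award" and "award" share a key.
        var scores = new Dictionary<string, double>(StringComparer.OrdinalIgnoreCase)
        {
            { "await", -1 }, { "awaited", -1 }, { "award", 3 }, { "awards", 3 }
        };

        Console.WriteLine(Score("We await the award.", scores));    // -1 + 3
        Console.WriteLine(Score("Nothing to see here.", scores));   // no known words
    }
}
```

Using `TryGetValue` avoids a separate `ContainsKey` call followed by an indexer lookup, and a sentence with no known words simply scores zero.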

Answer 1: (score: 1)

If you want to use a regular-expression approach, try:

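// Requires: using System; using System.IO; using System.Text.RegularExpressions;
using (FileStream fileStream = new FileStream(FILENAME, FileMode.Open, FileAccess.Read)) {
  using (StreamReader streamReader = new StreamReader(fileStream)) {
    // Reads only the first line; the test file keeps every pair on a single line
    String record = streamReader.ReadLine();
    foreach (String str in record.Split('\t')) {
      // Strip everything except digits and the minus sign, leaving just the score
      Console.WriteLine(Regex.Replace(str, @"[^-\d]", String.Empty));
    }
  }
  // No explicit Close() calls are needed: the using blocks dispose both objects
}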

Tested with this words.txt:

await -1    awaited -1  awaited -1  award 3 award 2 award 1 award 3 awards 3
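The snippet above prints only the numbers. If the word is needed alongside its score, one option is to capture both with a single pattern instead of deleting characters. This is a hedged sketch rather than the answerer's method: the `PairExtractor` class, the `Extract` method, and the pattern are assumptions made for illustration, and the pattern only recognizes words made of ASCII letters.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

static class PairExtractor
{
    // A word, some whitespace, then an optionally negative number (integer or decimal).
    static readonly Regex PairPattern = new Regex(@"([A-Za-z]+)\s+(-?\d+(?:\.\d+)?)");

    // Hypothetical helper: returns every (word, score) pair found in the text, in order.
    public static List<KeyValuePair<string, double>> Extract(string text)
    {
        var pairs = new List<KeyValuePair<string, double>>();
        foreach (Match m in PairPattern.Matches(text))
        {
            // InvariantCulture so "3.5" parses the same way on every machine.
            double score = double.Parse(m.Groups[2].Value, CultureInfo.InvariantCulture);
            pairs.Add(new KeyValuePair<string, double>(m.Groups[1].Value, score));
        }
        return pairs;
    }

    static void Main()
    {
        string line = "await -1\tawaited -1\taward 3\tawards 3";
        foreach (var pair in Extract(line))
            Console.WriteLine("{0} = {1}", pair.Key, pair.Value);
    }
}
```

Because the pattern matches pairs wherever they appear, it works the same whether the pairs sit on one line or on separate lines.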

Answer 2: (score: 0)

Combining vadz's answer and im_a_noob's answer, you should be able to read your words.txt file and put it into a dictionary.

    Dictionary<string, double> wordDictionary = new Dictionary<string, double>();
    using (FileStream fileStream = new FileStream(FILENAME, FileMode.Open, FileAccess.Read))
    using (StreamReader reader = new StreamReader(fileStream))
    {
        int lineCount = 0;
        int skippedLine = 0;
        while (!reader.EndOfStream)
        {
            string[] fields = reader.ReadLine().Split('\t');
            string word = fields[0];
            double value = 0;
            lineCount++;

            // Add the entry only if the line has exactly two fields, the second field
            // parses as a number, and the word is not already in the dictionary
            if (fields.Length == 2 && double.TryParse(fields[1], out value) && !wordDictionary.ContainsKey(word))
            {
                wordDictionary.Add(word, value);
            }
            else
            {
                skippedLine++;
            }
        }

        Console.WriteLine("Total Lines Read: {0}", lineCount);
        Console.WriteLine("Lines Skipped: {0}", skippedLine);
        Console.WriteLine("Expected Entries in Dictionary: {0}", lineCount - skippedLine);
        Console.WriteLine("Actual Entries in Dictionary: {0}", wordDictionary.Count);
        // No explicit Close() calls are needed: the using statements dispose both objects
    }

To score the sentences, you can use something like the following.

    string fileText = File.ReadAllText(COMMENTSTEXT); // COMMENTSTEXT = comments.txt
    // Assumes every sentence ends with a period; any other period inside a
    // sentence (for example in "3.5") is also treated as a sentence break
    var sentences = fileText.Split(new[] { '.' }, StringSplitOptions.RemoveEmptyEntries);

    foreach (string sentence in sentences)
    {
        double sentenceScore = 0;

        // Split the sentence once, then look each word up in the dictionary
        string[] words = sentence.Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string w in words)
        {
            double value;
            if (wordDictionary.TryGetValue(w, out value))
                sentenceScore += value;
        }

        Console.WriteLine("Sentence Score = {0}", sentenceScore);
    }
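Splitting on every period breaks on decimals such as "3.5". One possible refinement, sketched below, is to split only where sentence-ending punctuation is followed by whitespace. The `SentenceSplitter` class and its pattern are assumptions for illustration; abbreviations such as "e.g. " would still be treated as sentence ends.

```csharp
using System;
using System.Text.RegularExpressions;

static class SentenceSplitter
{
    // Hypothetical helper: splits after '.', '!' or '?' only when whitespace follows,
    // so the period in "3.5" does not end a sentence. The punctuation stays attached.
    public static string[] Split(string text)
    {
        return Regex.Split(text.Trim(), @"(?<=[.!?])\s+");
    }

    static void Main()
    {
        string text = "The award was 3.5 stars. We awaited more! Nothing else.";
        foreach (string sentence in Split(text))
            Console.WriteLine(sentence);
    }
}
```

The lookbehind `(?<=[.!?])` checks for the punctuation without consuming it, so only the whitespace between sentences is removed.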

Answer 3: (score: 0)

A working solution without a dictionary:

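using System;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        foreach (var comment in File.ReadAllLines(@"..\..\comments.txt"))
            Console.WriteLine(GetRating(comment));

        Console.ReadLine();
    }

    static double GetRating(string comment)
    {
        // Start at 0 so that a comment with no rated words scores 0 rather than NaN
        double rating = 0;

        // Collapse each run of whitespace (tabs included) into a single space
        var wordsLines = from line in File.ReadAllLines(@"..\..\words.txt")
                         where !String.IsNullOrWhiteSpace(line)
                         select Regex.Replace(line.Trim(), @"\s+", " ");

        var wordRatings = from wLine in wordsLines
                          let parts = wLine.Split(' ')
                          select new { Word = parts[0], Rating = Double.Parse(parts[1]) };

        var commentWords = comment.ToLower().Split(new Char[] { ' ', ',', '.', ':', ';' });

        foreach (var wr in wordRatings)
        {
            // Add the rating once for every occurrence of the word, so the result
            // is the sum over all rated words instead of only the last match
            rating += wr.Rating * commentWords.Count(w => w == wr.Word);
        }

        return rating;
    }
}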