Counting words in a word-count program that reads from a file name

Time: 2015-02-28 04:31:04

Tags: c linux

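I seem to be able to get the lines, characters, and spaces correctly. However, I'm having trouble figuring out how to count words. They don't have to be dictionary words; for example, fdasf fdjsak fea would be three words.

This is what I have: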

#include <stdio.h>

int main(int argc, char* argv[]) {
  int ccount = 0;
  int wcount = 1;
  int lcount = 0;
  int scount = 0;

  char* filename = argv[1];
  printf("filename is: %s\n", filename);

  FILE* infile;
  infile = fopen(filename, "r");

  if (infile == NULL) {
    printf("%s: is not a valid file\n", filename);
    return -1;
  }

  char c;
  while ((c = fgetc(infile)) != EOF) {
    if (c == ' ') {
      wcount++;
    }
    if (c == '\n') {
      lcount++;
    }
    if (c != ' ' || c != '\n') {
      ccount++;
    }
    if (c == ' ') {
      scount ++;
    }
  }

  printf("total number of lines: %d\n", lcount);
  printf("total number of characters: %d \n", ccount);
  printf("total number of non-whitespace characters: %d \n", scount );
  printf("total number of words: %d \n", wcount);

  return 0;
}

1 answer:

Answer 0 (score: 1)

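There are many ways to do this, but here is a short example that reads from stdin (to read from your file, just change stdin to infile). It does not count an empty line (only '\n') as a word. Note that it counts each run of spaces or tabs as one separator and takes words per line as separators + 1, so it assumes lines have no leading or trailing whitespace. You can rework it to suit your needs. The comments explain the logic. If you have questions, let me know: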

#define _POSIX_C_SOURCE 200809L  /* expose getline () under strict -std=c99/c11 */
#include <stdio.h>

int main (void) {

    char *line = NULL;  /* pointer to use with getline ()  */
    char *p = NULL;     /* pointer to parse getline return */
    ssize_t read = 0;   /* actual chars read per-line      */
    size_t n = 0;       /* buffer size, set by getline ()  */
    int spaces = 0;     /* counter for spaces and newlines */
    int total = 0;      /* counter for total words read    */

    printf ("\nEnter a line of text (or ctrl+d to quit)\n");

    while (printf ("\n input: ") && (read = getline (&line, &n, stdin)) != -1) 
    {
        /* strip trailing '\n' or '\r' (read > 0 guards against indexing line[-1]) */
        while (read > 0 && (line[read-1] == '\n' || line[read-1] == '\r'))
            line[--read] = 0;

        spaces = 0;
        p = line;

        if (read > 0) {        /* read = 0 covers '\n' case (blank line with [enter])  */
            while (*p) {                            /* for each character in line      */
                if (*p == '\t' || *p == ' ') {      /* if space,       */
                    while (*p == '\t' || *p == ' ') /* read all spaces */
                        p++;
                    spaces += 1;                    /* consider sequence of spaces 1   */
                } else
                    p++;                            /* if not space, increment pointer */
            }
            total += spaces + 1;                    /* words per-line = spaces + 1     */
        }

        /* report words with the same rule used for total: spaces + 1 on a non-empty line */
        printf (" chars read: %2zd,  spaces: %2d,  words: %2d,  total: %3d   | '%s'\n",
                read, spaces, (read > 0) ? spaces+1 : 0,
                total, (read > 0) ? line : "[enter]");
    }

    printf ("\n\n  Total words read: %d\n\n", total);

    return 0;

}

Output:

$ ./bin/countwords

Enter a line of text (or ctrl+d to quit)

 input: my dog has fleas
 chars read: 16,  spaces:  3,  words:  4,  total:   4   | 'my dog has fleas'

 input:
 chars read:  0,  spaces:  0,  words:  0,  total:   4   | '[enter]'

 input: fee fi fo fum
 chars read: 13,  spaces:  3,  words:  4,  total:   8   | 'fee fi fo fum'

 input:

  Total words read: 8

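Note: to expand the set of recognized whitespace characters, you can include the ctype.h header and use the isspace() function instead of checking only for spaces and tabs as above. It was left out on purpose, to limit the required headers to stdio.h.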
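
For what it's worth, here is a rough sketch of that isspace() variant, reading character by character from an already opened FILE* (such as the question's infile) instead of line by line. The function name count_words is made up for illustration and is not part of the code above. It counts a word each time a non-whitespace character follows whitespace (or the start of the stream), so leading, trailing, and repeated whitespace do not inflate the count:

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper (name chosen for this sketch): returns the number
 * of whitespace-delimited words in an already opened stream. */
long count_words (FILE *fp)
{
    long words = 0;
    int in_word = 0;    /* 1 while inside a run of non-whitespace chars   */
    int c;              /* int, not char, so EOF can be told from data    */

    while ((c = fgetc (fp)) != EOF) {
        if (isspace (c)) {
            in_word = 0;        /* any whitespace ends the current word   */
        } else if (!in_word) {
            in_word = 1;        /* first character of a new word          */
            words++;
        }
    }

    return words;
}
```

In the question's program you would call count_words (infile) after the fopen check. Keep in mind that it consumes the stream, so a second pass over the same file needs rewind (infile) first, or the line and character counters can be folded into the same loop.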