Problem with the unload() function (CS50 Week 5: Speller)

Time: 2020-08-21 19:52:37

Tags: c memory-management cs50

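I'm working on CS50's Week 5 assignment, Speller. I'm building my functions one at a time, and I'm running into problems with my unload function (line 151). Right now, I'm only testing the iteration by printing results, before I use that iteration to free each node. I do this by changing each node's word to "FREE" in the order the nodes are to be freed.

The function call (line 60) returns true, and the printf command prints successfully. However, everything inside the unload function itself is ignored. None of the DEBUG DEBUG DEBUG printf lines that I added to follow its progress are printed. The print() call on line 63 should print the table with every word set to "FREE" and every dictionary word location shown as "NOT FOUND". Instead, it prints the list and locations completely unchanged, and none of the DEBUG print commands in the for loop (line 155) trigger.

I don't understand why this happens. The unload() call on its own, whether or not it returns true, should at least trigger the first printf command in the for loop (line 157). But even that is skipped.

Can someone help me understand why the function returns true without making any of the changes it is supposed to make? Thanks in advance.

EDIT: Okay, I was told that I wasn't calling the unload function correctly on line 60. I've since corrected that. Now it prints "LOCATION 00:", but it ends as soon as it reaches the first while loop on line 158. I've had this problem before, and I'm not sure why it does this. strcmp() should see that the head node's word does not match "FREE" until the list has been changed from its end back to its beginning. Why does the while loop not even trigger?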

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>

unsigned int HASH_MAX = 50; // Max elements in hash table
unsigned int LENGTH = 20; // Max length of word to be stored

unsigned int hash(const char *word); // assign hash code -- [(code + current letter) * 3] * string length, % HASH_MAX
bool load(FILE *dictionary); // load dictionary into memory
bool check(char *word); // check if word exists in dictionary
bool unload(void); // unload dictionary from memory, free memory (CURRENTLY DEBUGGING, CHECKING ITERATION)
void print(void); // print table contents and node locations

typedef struct _node // node structure: stored word, pointer to next node
{
    char *word[20];
    struct _node *next;
} node;

node *HASH_TABLE[50];

int main(int argc, char *argv[])
{
    FILE *dictionary = fopen("C:/Users/aaron/Desktop/Dictionary.txt", "r"); // open dictionary file, read

    if (!dictionary) // if dictionary is NULL, return error message, end program
    {
        printf("FILE NOT FOUND\n");
        return 1;
    }

    if (load(dictionary)) // if dictionary loaded successfully (function call), close dictionary and print table contents
    {
        fclose(dictionary);
        print(); // print "LIST (number): {(name, address), ...}\n
    }

    char *checkword = "Albatross"; // test check function for word that does not exist in the library
    char *checkword2 = "Riku"; // test check function for word that does exist in the library

    if (check(checkword)) // return check results for checkword, found or not found
    {
        printf("\n%s found\n", checkword);
    }
    else
    {
        printf("\n%s not found\n", checkword);
    }

    if (check(checkword2)) // return check results for checkword2, found or not found
    {
        printf("\n%s found\n", checkword2);
    }
    else
    {
        printf("\n%s not found\n", checkword2);
    }

    if (unload()) // if unloaded successfully (function call), print contents
    {
        printf("\nUNLOADED...\n\n"); // DEBUG DEBUG DEBUG (confirm unload function returned true)
        print();
    }
}

unsigned int hash(const char *word) // assign hash code -- [(code + current letter) * 3] * string length, % HASH_MAX
{
    char word_conv[LENGTH + 1]; // store converted word for uniform key
    unsigned int code = 0; // hash code

    strcpy(word_conv, word);

    for (int i = 0; i < strlen(word); i++) // set all letters in the word to lower case
    {
        word_conv[i] = tolower(word_conv[i]);
    }

    for (int j = 0; j < strlen(word_conv); j++) // for all letters in converted word, add ascii value to code and multiply by 3
    {
        code += word_conv[j];
        code = code * 3;
    }

    code = code % HASH_MAX; // set code to remainder of current code divided by maximum hash table size

    return code;
}

bool load(FILE *dictionary) // load dictionary into memory
{
    char word[LENGTH+1]; // store next word in the dictionary

    while (!feof(dictionary)) // until end of dictionary file
    {
        fscanf(dictionary, "%s", word); // scan for next word

        node *new_n = malloc(sizeof(node)); // new node
        strcpy(new_n->word, word); // store scanned word in new node
        new_n->next = NULL; // new node's next pointer set to NULL

        unsigned int code = hash(word); // retrieve and store hash code

        if (HASH_TABLE[code] == NULL) // if hash location has no head
        {
            HASH_TABLE[code] = new_n; // set new node to location head
        }
        else if (HASH_TABLE[code] != NULL) // if head already exists at hash location
        {
            node *trav = HASH_TABLE[code]; // set traversal node

            while (trav->next != NULL) // while traversal node's next pointer is not NULL
            {
                trav = trav->next; // move to next node
            }

            if (trav->next == NULL) // if traversal node's next pointer is null
            {
                trav->next = new_n; // set new node to traversal node's next pointer
            }
        }
    }

    return true; // confirm successful load
}

bool check(char *word) // check if word exists in dictionary
{
    unsigned int code = hash(word); // retrieve and store hash code

    node *check = HASH_TABLE[code]; // set traversal node to hash location head

    while (check != NULL) // while traversal node is not NULL
    {
        int check_true = strcasecmp(check->word, word); // compare traversal node's word to provided word argument

        if (check_true == 0) // if a match is found, return true
        {
            return true;
        }
        else if (check_true != 0) // if no match, move to next node
        {
            check = check->next;
        }
    }

    if (check == NULL) // if end of list is reached without a match, return false
        return false;
}

bool unload(void) // unload dictionary from memory, free memory (CURRENTLY DEBUGGING, CHECKING ITERATION)
{
    char *word = "FREE"; // DEBUG DEBUG DEBUG (changin all nodes' words to "FREE" to test iteration)

    for (int i = 0; i < HASH_MAX; i++) // for every element in the hash table, HASH_MAX (50)
    {
        printf("LOCATION %02d:\n", i); // DEBUG DEBUG DEBUG (print current hash table location)
        while (strcmp(HASH_TABLE[i]->word, word) != 0) // while the head node's word is not "FREE"
        {
            node *trav = HASH_TABLE[i]; // set traversal node to head
            printf("HEAD WORD: %s\n", HASH_TABLE[i]->word); // DEBUG DEBUG DEBUG (print head word to confirm while condition)

            while (strcmp(trav->next->word, word) != 0) // while the traversal node's word is not "FREE"
            {
                trav = trav->next; // move to next node
                printf("."); // DEBUG DEBUG DEBUG (print a dot for every location skipped)
            }

            printf("\n"); // DEBUG DEBUG DEBUG

            strcpy(trav->word, word); // set traversal node's word to "FREE"

            printf("{"); // DEBUG DEBUG DEBUG

            while (trav != NULL) // DEBUG DEBUG DEBUG (print hash location's current list of words)
            {
                printf("%s, ", trav->word); // DEBUG DEBUG DEBUG
            }

            printf("}\n\n"); // DEBUG DEBUG DEBUG
        }
    }

    return true; // freed successfully
}

void print(void) // print hash table contents and node locations
{
    for (int i = 0; i < HASH_MAX; i++) // for every element in the hash table
    {
        node *check = HASH_TABLE[i]; // set traversal node to current hash table element head

        printf("LIST %02d: {", i); // print hash table element location

        while (check != NULL) // for all nodes in the current linked list
        {
            printf("%s, ", check->word); // print traversal node's word
            check = check->next; // move to next node
        }

        printf("}\n");
    }

    printf("\n");

    FILE *dictionary = fopen("C:/Users/aaron/Desktop/Dictionary.txt", "r"); // open dictionary file

    while (!feof(dictionary)) // for all words in the dictionary
    {
        char word[LENGTH + 1]; // store next word

        fscanf(dictionary, "%s", word); // scan for next word

        unsigned int code = hash(word); // retrieve and store word's hash code

        node *search = HASH_TABLE[code]; // set traversal node to hash location head

        while (search != NULL) // for all nodes at that location, or until word is found
        {
            if (strcasecmp(search->word, word) == 0) // compare traversal node's word to scanned word (case insensitive)
            {
                printf("%s: %p\n", search->word, search); // print traversal node's word and location
                break; // break while loop
            }
            else
            {
                search = search->next; // if traversal node's word does not match scanned word, move to next node
            }
        }

        if (search == NULL) // if the scanned word matches none of the words in the hash location's linked list
            printf("\"%s\" NOT FOUND\n", word); // word not found
    }

    fclose(dictionary); // close dictionary file
}

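2 Answers:

Answer 0 (score: 2)

Caveat: chqrlie has pointed out many of the basic issues, but here's some refactored code.


Your main issue was that unload didn't actually remove the nodes.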
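
The thread never spells out why the asker's loop stops right after printing "LOCATION 00:". One plausible cause, offered here as an assumption rather than a confirmed diagnosis: bucket 0 is empty, so HASH_TABLE[0] is NULL and strcmp(HASH_TABLE[i]->word, word) reads through a null pointer. The sketch below shows a per-bucket free loop whose condition guards against that. The names WORD_MAX, push_front and free_bucket are made up for this illustration.

```c
#include <stdlib.h>
#include <string.h>

enum { WORD_MAX = 20 };              /* hypothetical name for this sketch */

typedef struct node {
    char word[WORD_MAX + 1];
    struct node *next;
} node;

/* Push a copy of `text` on the front of the list. Returns the new head,
   or the old head unchanged if the allocation fails. */
node *push_front(node *head, const char *text)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;
    strncpy(n->word, text, WORD_MAX);
    n->word[WORD_MAX] = '\0';        /* strncpy may leave it unterminated */
    n->next = head;
    return n;
}

/* Free every node in one bucket and return how many were freed.
   An empty bucket (*head == NULL) never enters the loop, so no null
   pointer is dereferenced. */
size_t free_bucket(node **head)
{
    size_t freed = 0;
    while (*head != NULL) {
        node *n = *head;
        *head = n->next;             /* unlink before freeing */
        free(n);
        freed++;
    }
    return freed;
}
```

Calling free_bucket(&HASH_TABLE[i]) for each i would then cover empty and non-empty buckets alike.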


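One thing to note is that it is easier/faster/better to use tolower once per string.

If the lowercased version is what we store in the node, and we lowercase the search word in check, we can use strcmp instead of strcasecmp [which must redo the lowercasing for both arguments on each loop iteration].

So, I've changed the hash function to lowercase its argument "in-place".


As I mentioned in my comment above, print was needlessly rereading the dictionary file. So, I've removed that code. If that were necessary, it should go into another function, or load and/or check should be reused.

(i.e.) print should do one thing well [a programming maxim].


Personally, I dislike "sidebar" comments: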

if (unload()) // if unloaded successfully (function call), print contents

I prefer the comment to be above the line:

// if unloaded successfully (function call), print contents
if (unload())

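To me, this is much clearer, and it helps keep lines from going beyond 80 characters wide.


Certain fixed constants (e.g. HASH_MAX and LENGTH) are global variables. This prevents them from being used to define arrays.

(e.g.) you couldn't say: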

node *HASH_TABLE[HASH_MAX];

and had to "hardwire" it as:

node *HASH_TABLE[50];

If we define them with either #define or enum, then we can use the preferred definitions.
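
A minimal sketch of that point, using the same constant names as the thread. The helper bucket_count is hypothetical and exists only to make the array size observable.

```c
#include <stddef.h>

/* Enumeration constants are integer constant expressions, so they can
   size arrays at file scope. A global `unsigned int` variable cannot. */
enum { HASH_MAX = 50, LENGTH = 20 };

typedef struct node {
    char word[LENGTH + 1];       /* LENGTH characters plus the '\0' */
    struct node *next;
} node;

node *HASH_TABLE[HASH_MAX];      /* static storage: every bucket starts NULL */

/* Hypothetical helper: number of buckets, derived from the array itself. */
size_t bucket_count(void)
{
    return sizeof HASH_TABLE / sizeof HASH_TABLE[0];
}
```

With this layout, changing HASH_MAX in one place resizes the table and every loop bound that uses the constant.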


Doing something like:

for (int i = 0; i < strlen(word); i++)

can increase the loop time from O(length) to O(length^2), because strlen is called "length" times inside the loop and rescans the string each time.

Much better to do:

int len = strlen(word);
for (int i = 0; i < len; i++)

But even this makes an extra pass over the buffer. Better still is something like:

for (int chr = *word++;  chr != 0;  chr = *word++)
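
As an illustration of that single-pass idiom, here is a small sketch that counts uppercase letters without calling strlen. The function count_upper is hypothetical, and the pointer is typed unsigned char so that every value handed to isupper is in the range the function accepts.

```c
#include <ctype.h>
#include <stddef.h>

/* Count uppercase letters in one pass over the string.
   Each character is read exactly once; the loop ends at the '\0'. */
size_t count_upper(const char *word)
{
    const unsigned char *p = (const unsigned char *)word;
    size_t n = 0;

    for (int chr = *p++; chr != 0; chr = *p++) {
        if (isupper(chr))
            n++;
    }
    return n;
}
```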

I've refactored the code and annotated the bugs. The original code is kept inside #if 0 blocks:

#if 0
// old/original code
#else
// new/refactored code
#endif

Anyway, here's the code:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
#if 1
#include <ctype.h>
#endif

// Max elements in hash table
#if 0
unsigned int HASH_MAX = 50;
#else
enum {
    HASH_MAX = 50
};
#endif

// Max length of word to be stored
#if 0
unsigned int LENGTH = 20;
#else
enum {
    LENGTH = 20
};
#endif

// assign hash code -- [(code + current letter) * 3] * string length, % HASH_MAX
#if 0
unsigned int hash(const char *word);
#else
unsigned int hash(char *word);
#endif

// load dictionary into memory
bool load(FILE *dictionary);

// check if word exists in dictionary
#if 0
bool check(char *word);
#else
bool check(const char *word);
#endif

// unload dictionary from memory, free memory (CURRENTLY DEBUGGING,
// CHECKING ITERATION)
bool unload(void);

// print table contents and node locations
void print(void);

// node structure: stored word, pointer to next node
typedef struct _node {
#if 0
    char *word[20];
#else
    char word[LENGTH + 1];
#endif
    struct _node *next;
} node;

#if 0
node *HASH_TABLE[50];
#else
node *HASH_TABLE[HASH_MAX];
#endif

int
main(int argc, char *argv[])
{
    // open dictionary file, read
#if 0
    FILE *dictionary = fopen("C:/Users/aaron/Desktop/Dictionary.txt", "r");
#else
    FILE *dictionary = fopen("Dictionary.txt", "r");
#endif

    // if dictionary is NULL, return error message, end program
    if (!dictionary) {
        printf("FILE NOT FOUND\n");
        return 1;
    }

    // if dictionary loaded successfully (function call), close dictionary and
    // print table contents
    if (load(dictionary)) {
        fclose(dictionary);
        // print "LIST (number): {(name, address), ...}\n
        print();
    }

    // test check function for word that does not exist in the library
    char *checkword = "Albatross";

    // test check function for word that does exist in the library
    char *checkword2 = "Riku";

    // return check results for checkword, found or not found
    if (check(checkword)) {
        printf("\n%s found\n", checkword);
    }
    else {
        printf("\n%s not found\n", checkword);
    }

    // return check results for checkword2, found or not found
    if (check(checkword2)) {
        printf("\n%s found\n", checkword2);
    }
    else {
        printf("\n%s not found\n", checkword2);
    }

    // if unloaded successfully (function call), print contents
    if (unload()) {
        // DEBUG DEBUG DEBUG (confirm unload function returned true)
        printf("\nUNLOADED...\n\n");
        print();
    }
}

// assign hash code -- [(code + current letter) * 3] * string length, % HASH_MAX
unsigned int
hash(char *word)
{
    // store converted word for uniform key
#if 0
    char word_conv[LENGTH + 1];
#endif

    // hash code
    unsigned int code = 0;

#if 0
    strcpy(word_conv, word);

    // set all letters in the word to lower case
    for (int i = 0; i < strlen(word); i++) {
        word_conv[i] = tolower(word_conv[i]);
    }

    // for all letters in converted word, add ascii value to code and multiply by 3
    for (int j = 0; j < strlen(word_conv); j++) {
        code += word_conv[j];
        code = code * 3;
    }
#else
    int chr;
    while (1) {
        chr = *word;

        if (chr == 0)
            break;

        chr = tolower(chr);
        *word++ = chr;

        code += chr;
        code *= 3;
    }
#endif

    // set code to remainder of current code divided by maximum hash table size
    code = code % HASH_MAX;

    return code;
}

// load dictionary into memory
bool
load(FILE * dictionary)
{
    // store next word in the dictionary
    char word[LENGTH + 1];

    // until end of dictionary file
// NOTE/BUG: don't use feof
#if 0
    while (!feof(dictionary)) {
        // scan for next word
        fscanf(dictionary, "%s", word);
#else
    // scan for next word
    while (fscanf(dictionary, "%s", word) == 1) {
#endif
        // new node
        node *new_n = malloc(sizeof(node));

        // store scanned word in new node
        strcpy(new_n->word, word);
        // new node's next pointer set to NULL
        new_n->next = NULL;

        // retrieve and store hash code
        unsigned int code = hash(new_n->word);

        // NOTE/BUG: there's no need to append to the end of the list -- pushing
        // on the front is adequate and is faster
#if 0
        // if hash location has no head
        if (HASH_TABLE[code] == NULL) {
            // set new node to location head
            HASH_TABLE[code] = new_n;
        }
        // if head already exists at hash location
        else if (HASH_TABLE[code] != NULL) {
            // set traversal node
            node *trav = HASH_TABLE[code];

            // while traversal node's next pointer is not NULL
            while (trav->next != NULL) {
                // move to next node
                trav = trav->next;
            }

            // if traversal node's next pointer is null
            if (trav->next == NULL) {
                // set new node to traversal node's next pointer
                trav->next = new_n;
            }
        }
#else
        new_n->next = HASH_TABLE[code];
        HASH_TABLE[code] = new_n;
#endif
    }

    // confirm successful load
    return true;
}

// check if word exists in dictionary
#if 0
bool
check(char *word)
#else
bool
check(const char *arg)
#endif
{
    char word[LENGTH + 1];

    // retrieve and store hash code
#if 1
    strcpy(word,arg);
#endif
    unsigned int code = hash(word);

    // set traversal node to hash location head
    node *check = HASH_TABLE[code];

    // while traversal node is not NULL
    while (check != NULL) {
        // compare traversal node's word to provided word argument
// NOTE/BUG: strcmp is faster than strcasecmp if we convert to lowercase _once_
#if 0
        int check_true = strcasecmp(check->word, word);
#else
        int check_true = strcmp(check->word, word);
#endif

#if 0
        // if a match is found, return true
        if (check_true == 0) {
            return true;
        }
        // if no match, move to next node
        else if (check_true != 0) {
            check = check->next;
        }
#else
        if (check_true == 0)
            return true;

        check = check->next;
#endif
    }

    // if end of list is reached without a match, return false
#if 0
    if (check == NULL)
        return false;
#else
    return false;
#endif
}

// unload dictionary from memory, free memory
// (CURRENTLY DEBUGGING, CHECKING ITERATION)
bool
unload(void)
{
    // DEBUG DEBUG DEBUG (changin all nodes' words to "FREE" to test iteration)
#if 0
    char *word = "FREE";
#endif

    // for every element in the hash table, HASH_MAX (50)
    for (int i = 0; i < HASH_MAX; i++) {
#if 0
        // DEBUG DEBUG DEBUG (print current hash table location)
        printf("LOCATION %02d:\n", i);
        // while the head node's word is not "FREE"
        while (strcmp(HASH_TABLE[i]->word, word) != 0) {
            // set traversal node to head
            node *trav = HASH_TABLE[i];

            // DEBUG DEBUG DEBUG (print head word to confirm while condition)
            printf("HEAD WORD: %s\n", HASH_TABLE[i]->word);

            // while the traversal node's word is not "FREE"
            while (strcmp(trav->next->word, word) != 0) {
                // move to next node
                trav = trav->next;
                // DEBUG DEBUG DEBUG (print a dot for every location skipped)
                printf(".");
            }

            // DEBUG DEBUG DEBUG
            printf("\n");

            // set traversal node's word to "FREE"
            strcpy(trav->word, word);

            // DEBUG DEBUG DEBUG
            printf("{");

            // DEBUG DEBUG DEBUG (print hash location's current list of words)
            while (trav != NULL) {
                // DEBUG DEBUG DEBUG
                printf("%s, ", trav->word);
            }

            // DEBUG DEBUG DEBUG
            printf("}\n\n");
        }
#else
        node *nxt;
        for (node *cur = HASH_TABLE[i];  cur != NULL;  cur = nxt) {
            nxt = cur->next;
            free(cur);
        }
        HASH_TABLE[i] = NULL;
#endif
    }

    // freed successfully
    return true;
}

// print hash table contents and node locations
void
print(void)
{
    // for every element in the hash table
    for (int i = 0; i < HASH_MAX; i++) {
        // set traversal node to current hash table element head
        node *check = HASH_TABLE[i];

        // print hash table element location
        printf("LIST %02d: {", i);

        // for all nodes in the current linked list
        while (check != NULL) {
            // print traversal node's word
            printf("%s, ", check->word);
            // move to next node
            check = check->next;
        }

        printf("}\n");
    }

    printf("\n");

// NOTE/BUG: why reread dictionary after printing it?
#if 0
    // open dictionary file
    FILE *dictionary = fopen("C:/Users/aaron/Desktop/Dictionary.txt", "r");

    // for all words in the dictionary
    while (!feof(dictionary)) {
        // store next word
        char word[LENGTH + 1];

        // scan for next word
        fscanf(dictionary, "%s", word);

        // retrieve and store word's hash code
        unsigned int code = hash(word);

        // set traversal node to hash location head
        node *search = HASH_TABLE[code];

        // for all nodes at that location, or until word is found
        while (search != NULL) {
            // compare traversal node's word to scanned word (case insensitive)
            if (strcasecmp(search->word, word) == 0) {
                // print traversal node's word and location
                printf("%s: %p\n", search->word, search);
                // break while loop
                break;
            }
            else {
                // if traversal node's word does not match scanned word,
                // move to next node
                search = search->next;
            }
        }

        // if the scanned word matches none of the words in the hash location's
        // linked list
        if (search == NULL)
            // word not found
            printf("\"%s\" NOT FOUND\n", word);
    }

    // close dictionary file
    fclose(dictionary);
#endif
}

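Here's a version with the #if 0 blocks removed.

Also, I've reordered the load function slightly so that it reads the data directly into its final place inside the node element (i.e. it eliminates the intermediate buffer and a strcpy).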

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
#include <ctype.h>

// Max elements in hash table
enum {
    HASH_MAX = 50
};

// Max length of word to be stored
enum {
    LENGTH = 20
};

// assign hash code -- [(code + current letter) * 3] * string length, % HASH_MAX
unsigned int hash(char *word);

// load dictionary into memory
bool load(FILE *dictionary);

// check if word exists in dictionary
bool check(const char *word);

// unload dictionary from memory, free memory (CURRENTLY DEBUGGING,
// CHECKING ITERATION)
bool unload(void);

// print table contents and node locations
void print(void);

// node structure: stored word, pointer to next node
typedef struct _node {
    char word[LENGTH + 1];
    struct _node *next;
} node;

node *HASH_TABLE[HASH_MAX];

int
main(int argc, char *argv[])
{
    // open dictionary file, read
    FILE *dictionary = fopen("Dictionary.txt", "r");

    // if dictionary is NULL, return error message, end program
    if (!dictionary) {
        printf("FILE NOT FOUND\n");
        return 1;
    }

    // if dictionary loaded successfully (function call), close dictionary and
    // print table contents
    if (load(dictionary)) {
        fclose(dictionary);
        // print "LIST (number): {(name, address), ...}\n
        print();
    }

    // test check function for word that does not exist in the library
    char *checkword = "Albatross";

    // test check function for word that does exist in the library
    char *checkword2 = "Riku";

    // return check results for checkword, found or not found
    if (check(checkword)) {
        printf("\n%s found\n", checkword);
    }
    else {
        printf("\n%s not found\n", checkword);
    }

    // return check results for checkword2, found or not found
    if (check(checkword2)) {
        printf("\n%s found\n", checkword2);
    }
    else {
        printf("\n%s not found\n", checkword2);
    }

    // if unloaded successfully (function call), print contents
    if (unload()) {
        // DEBUG DEBUG DEBUG (confirm unload function returned true)
        printf("\nUNLOADED...\n\n");
        print();
    }
}

// assign hash code -- [(code + current letter) * 3] * string length, % HASH_MAX
unsigned int
hash(char *word)
{
    // hash code
    unsigned int code = 0;

    unsigned char chr;
    while (1) {
        chr = *word;

        if (chr == 0)
            break;

        chr = tolower(chr);
        *word++ = chr;

        code += chr;
        code *= 3;
    }

    // set code to remainder of current code divided by maximum hash table size
    code = code % HASH_MAX;

    return code;
}

// load dictionary into memory
bool
load(FILE *dictionary)
{

    // scan for next word
    while (1) {
        // new node
        node *new_n = malloc(sizeof(node));
        if (new_n == NULL)
            return false;

        // read at most LENGTH characters straight into the node
        if (fscanf(dictionary, "%20s", new_n->word) != 1) {
            free(new_n);
            break;
        }

        // new node's next pointer set to NULL
        new_n->next = NULL;

        // retrieve and store hash code
        unsigned int code = hash(new_n->word);

        // pushing on the front of the list is adequate and is faster
        new_n->next = HASH_TABLE[code];
        HASH_TABLE[code] = new_n;
    }

    // confirm successful load
    return true;
}

// check if word exists in dictionary
bool
check(const char *arg)
{
    char word[LENGTH + 1];

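    // a word longer than LENGTH cannot be in the dictionary
    if (strlen(arg) > LENGTH)
        return false;
    // copy the argument, then lowercase the copy and get its hash code
    strcpy(word, arg);
    unsigned int code = hash(word);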

    // set traversal node to hash location head
    node *check = HASH_TABLE[code];

    // while traversal node is not NULL
    while (check != NULL) {
        // compare traversal node's word to provided word argument
        int check_true = strcmp(check->word, word);

        if (check_true == 0)
            return true;

        check = check->next;
    }

    // if end of list is reached without a match, return false
    return false;
}

// unload dictionary from memory, free memory
// (CURRENTLY DEBUGGING, CHECKING ITERATION)
bool
unload(void)
{

    // for every element in the hash table, HASH_MAX (50)
    for (int i = 0; i < HASH_MAX; i++) {
        node *nxt;
        for (node *cur = HASH_TABLE[i];  cur != NULL;  cur = nxt) {
            nxt = cur->next;
            free(cur);
        }
        HASH_TABLE[i] = NULL;
    }

    // freed successfully
    return true;
}

// print hash table contents and node locations
void
print(void)
{
    // for every element in the hash table
    for (int i = 0; i < HASH_MAX; i++) {
        // set traversal node to current hash table element head
        node *check = HASH_TABLE[i];

        // print hash table element location
        printf("LIST %02d: {", i);

        // for all nodes in the current linked list
        while (check != NULL) {
            // print traversal node's word
            printf("%s, ", check->word);

            // move to next node
            check = check->next;
        }

        printf("}\n");
    }

    printf("\n");
}

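UPDATE:

"Could you explain for (int chr = *word++; chr != 0; chr = *word++)? I don't know what *word++ means in this context."

Sure. chr = *word++; means: dereference word [a char pointer]. This fetches the char value that word points to (i.e. fetches the value from memory). Then, that value is stored in chr. Then, word is incremented [so that it points to the next character in the array].

The statement is composed of three operators: = is the assignment operator, * is the dereference operator, and ++ is the post-increment operator.

Postfix ++ binds more tightly than unary *, so the expression parses as *(word++): the ++ applies to the pointer word, not to the character it points to. Because it is a post-increment, the value of word++ is the old pointer, so * fetches the current character, which is placed in chr, and word ends up pointing at the next one. It is as if the following were performed as a single statement: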

chr = *word;
word += 1;
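
To see those two steps in action, here is a small sketch. The function read_and_advance is hypothetical; it wraps the idiom so the caller can observe both the character that was read and the pointer's new position.

```c
/* Read the character at *pp and advance *pp by one, using the idiom. */
int read_and_advance(const char **pp)
{
    const char *word = *pp;
    int chr = *word++;     /* parsed as *(word++): read, then step forward */
    *pp = word;            /* hand the advanced pointer back to the caller */
    return chr;
}
```

Starting with a pointer to "ab", three calls return 'a', 'b' and '\0' in turn, and the pointer moves one position each time.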

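"For the reasons explained in my answer, chr = tolower(chr); should be chr = tolower((unsigned char)chr);. Alternatively, you can define chr as unsigned char chr;"

I was under the impression that tolower et al. were "self-protective" against this (e.g. that they did the unsigned char cast themselves). But the [linux] manpage says it is UB if the value is out of range. I've edited the second example to use unsigned char chr;.

Oddly, glibc's tolower has a range check built in that works on the int value and returns the original value (i.e. it does not index into the translation table) if the value is out of range. This appears to be part of some BSD compatibility [the BSD manpage states that it does a range check, but that the feature is deprecated]. I'm guessing that the glibc range check was added after the manpage was written.

To me, the macro should just do the cast itself [and so should the global function]. But I think this might break the BSD compatibility.

But now, because of backward compatibility, we're all stuck with the old way [or with adding a wrapper macro].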
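
One way such a wrapper could look, written as a function rather than a macro. This is only a sketch, and the name lower_char is made up here; it is not part of any library.

```c
#include <ctype.h>

/* Hypothetical wrapper: route the value through unsigned char so that a
   negative plain char never reaches tolower. */
char lower_char(char c)
{
    return (char)tolower((unsigned char)c);
}
```

Code that lowercases plain char values can then call lower_char(c) everywhere and keep the cast in a single place.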


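"It is confusing for hash to have a side effect on its argument, and further confusing that this side effect is necessary for the strcmp in check to work."

The side effect is [probably] no more egregious than what strtok does. That is, it isn't modifying a hidden/unrelated global, etc.

IMO, it wouldn't be confusing if the effect were commented [I documented it in the answer text]. Perhaps renaming hash to something more descriptive would help. We could do: take_hash_of_argument_that_we_modify_to_lowercase_first

This would make the function name "self-documenting", as some (e.g. "Uncle" Bob Martin(?)) might suggest member functions should be.

But maybe hash_and_lowercase would be better. That might be a sufficient clue to readers that they need to consult the API documentation for the function rather than assume they know all about it from the name alone.

Linked list traversal is much faster with strcmp, so, at a minimum [architecturally], we want to store lowercase strings in the nodes. We don't want to repeat the lowercasing for each node on each scan. And we don't want strcasecmp to repeat the lowercasing of word [and of the string in the node] on each loop iteration.

As you suggest, we could have two functions and still achieve this refactoring: a string-based version of tolower that lowercases its argument, leaving hash as it was originally.

Originally, I considered this approach. I soon realized that everywhere we did a hash, we wanted it to be on the lowercased string. We could achieve this with (e.g.):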

strlower(word);
value = hash(word);

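But there was no use case here for making either of these calls separately, only as a pair.

So, given that, why scan the argument string twice and slow the operation down by 2x?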
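
For comparison, the two-function alternative discussed here might look like the sketch below. The name lower_in_place stands in for the strlower call mentioned in the thread (an assumption, since the thread never shows its body), and hash only reads its argument.

```c
#include <ctype.h>

enum { HASH_MAX = 50 };

/* Hypothetical helper: lowercase the string in place (first pass). */
void lower_in_place(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)tolower((unsigned char)*s);
}

/* Hash with no side effect: (code + letter) * 3 per character (second pass). */
unsigned int hash(const char *word)
{
    unsigned int code = 0;

    for (; *word != '\0'; word++)
        code = (code + (unsigned char)*word) * 3;

    return code % HASH_MAX;
}
```

This keeps hash free of side effects at the cost of the second scan that the answer objects to. For words of at most 20 characters the difference is likely small, so the choice is mostly one of taste.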

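From JFK [after the failed Bay of Pigs invasion]: Mistakes are not errors if we admit them

So, I'd paraphrase that as: Side effects are not errors if we document them

Answer 1 (score: 1)

There are multiple problems in your code:

  • The word member of the _node structure has the wrong type: it should be an array of characters (LENGTH + 1 of them, to leave room for the terminator), not an array of 20 char pointers. And do not use _node: identifiers starting with _ are reserved at file scope. Change the definition to: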

      typedef struct node { // node structure: stored word, pointer to next node
          char word[LENGTH+1];
          struct node *next;
      } node;
    
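  • Your reading loop is incorrect: while (!feof(dictionary)) is not the proper test to detect the end of the file. Instead, you should test whether fscanf() successfully reads the next word: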

      while (fscanf(dictionary, "%s", word) == 1) // until end of dictionary file
    

    Furthermore, you should specify a maximum length for fscanf() to avoid undefined behavior on long words:

      while (fscanf(dictionary, "%19s", word) == 1) // read at most 19 characters
    
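  • You do not check for allocation failure.

  • There are many redundant tests, such as else if (HASH_TABLE[code] != NULL) and if (trav->next == NULL) in load(), and else if (check_true != 0) and if (check == NULL) in check().

  • You do not modify trav in the loop while (trav != NULL) in the DEBUG code, which causes an infinite loop.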
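
Several of the points above (the fscanf test, the width limit, the allocation check, and dropping the redundant branches) can be combined in one sketch. This is an illustration under stated assumptions, not the answerer's code: first_letter_hash is a placeholder hash invented for this example, insert_word is a hypothetical helper, and the width 20 is chosen to match LENGTH.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { HASH_MAX = 50, LENGTH = 20 };

typedef struct node {
    char word[LENGTH + 1];
    struct node *next;
} node;

node *HASH_TABLE[HASH_MAX];

/* Placeholder hash for this sketch only: bucket by the first byte. */
unsigned int first_letter_hash(const char *word)
{
    return (unsigned char)word[0] % HASH_MAX;
}

/* Hypothetical helper: add one word, reporting allocation failure. */
bool insert_word(const char *word)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return false;

    strncpy(n->word, word, LENGTH);
    n->word[LENGTH] = '\0';              /* always terminate the copy */

    unsigned int code = first_letter_hash(n->word);
    n->next = HASH_TABLE[code];          /* push on the front: no branches */
    HASH_TABLE[code] = n;
    return true;
}

/* Load whitespace-separated words. The loop ends when fscanf cannot read
   another word; "%20s" stores at most LENGTH characters plus the '\0'. */
bool load(FILE *dictionary)
{
    char word[LENGTH + 1];

    while (fscanf(dictionary, "%20s", word) == 1) {
        if (!insert_word(word))
            return false;                /* caller can unload what was loaded */
    }
    return true;
}
```

A word longer than 20 characters is split into several entries by the width limit instead of overflowing the buffer. Whether that or rejecting the word is preferable depends on the dictionary.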

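Freeing the dictionary in unload() is not difficult. Your iteration-checking code is far too complicated, and you already have correct iteration code in print(). Here is a simple example: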

bool unload(void) { // unload dictionary from memory, free memory
    for (int i = 0; i < HASH_MAX; i++) {
        while (HASH_TABLE[i]) {
            node *n = HASH_TABLE[i];
            HASH_TABLE[i] = n->next;
            free(n);
        }
    }
    return true;
}

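Note also that there is no need to store the converted word in order to compute the hash value, and that char values must be cast to (unsigned char) before being passed to tolower(), because this function is only defined for the values of unsigned char and the special negative value EOF. char may be a signed type, so tolower(word[i]) has undefined behavior for extended characters.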

unsigned int hash(const char *word) // assign hash code -- [(code + current letter) * 3] * string length, % HASH_MAX
{
    unsigned int code = 0; // hash code

    for (int i = 0; word[i] != '\0'; i++) {
        // compute hashcode from lowercase letters
        code = (code + tolower((unsigned char)word[i])) * 3;
    }

    code = code % HASH_MAX; // set code to remainder of current code divided by maximum hash table size

    return code;
}
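
Because this hash leaves its argument untouched, check still needs a case-insensitive comparison. The question's code uses strcasecmp, which is a POSIX function rather than part of standard C. A portable stand-in could be sketched as follows; equal_ignore_case is a hypothetical name, not something from the thread.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper: true if a and b are equal when case is ignored. */
bool equal_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return false;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}
```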