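Fixing Python code

Time: 2017-04-26 15:20:42

Tags: python python-2.7

I am trying to implement an iterator class named CharCounter. The class opens a text file and provides an iterator that returns the words from that file which contain a user-specified number of characters. It should output one word per line. That is not what it does: it outputs the words as a list, and then it continuously outputs 'a'. How can I fix my code?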

class CharCounter(object):
    def __init__(self, fileNm, strlen):
        self._fileNm = fileNm
        self._strlen = strlen
        fw = open(fileNm)
        text = fw.read()

        lines = text.split("\n")
        words = []
        pwords =[]

        for each in lines:
            words += each.split(" ")

        chkEnd = ["'",'"',",",".",")","("]
        if words[-1] in chkEnd:
            words = words.rstrip()

        for each in words:
            if len(each) == strlen:
                pwords.append(each)

        print(pwords)

    def __iter__(self):
        return CharCounterIterator(self._fileNm)

class CharCounterIterator(object):
    def __init__(self,fileNm):
        self._fileNm = fileNm
        self._index = 0

    def __iter__(self):
        return self

    def next(self):
        try:
            ret = self._fileNm[self._index]
            return ret
        except IndexError:
            raise StopIteration

if __name__=="__main__":
    for word in CharCounter('agency.txt',11):
        print "%s" %word

1 Answer:

Answer 0 (score: 0)

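Code posted on SO should not read a file unless the question is about reading files, because the result cannot be reproduced and verified by others. (See MCVE.) Instead, define a text string as a stand-in for the file.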
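One way to follow this advice, sketched here under the assumption of Python 3, is to wrap a string in io.StringIO so that it behaves like an open file. The function name read_words and the sample text are made up for illustration; they do not come from the question.

```python
import io


def read_words(fileobj):
    """Return all whitespace-separated words from a file-like object."""
    return fileobj.read().split()


# A string standing in for the contents of agency.txt (contents invented here).
sample = io.StringIO("first line of text\nsecond line\n")
print(read_words(sample))
```

Code written against a file-like object in this way can later be handed a real open file without changes.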

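Your code prints the words of length n as a list because that is what you asked it to do with print(pwords). It then repeatedly prints the first character of the file name because that is what you asked it to do in the next method (__next__ in Python 3): self._fileNm[self._index] indexes into the file name string, and _index is never incremented, so every call returns 'a', the first character of 'agency.txt'.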
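For comparison, here is a minimal sketch of an iterator that does advance. It assumes the iterator is handed the filtered word list rather than the file name; the class name WordListIterator is hypothetical and does not appear in the question. It defines __next__ and aliases next to it, so it should work on both Python 3 and Python 2.7.

```python
class WordListIterator(object):
    """Iterate over a list of words, one per call (hypothetical helper)."""

    def __init__(self, words):
        self._words = words
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        try:
            ret = self._words[self._index]
        except IndexError:
            raise StopIteration
        self._index += 1  # advance, otherwise the same item repeats forever
        return ret

    next = __next__  # Python 2.7 spells the method "next"


for word in WordListIterator(["association", "information"]):
    print(word)
```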

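Your class's __init__ does more than you describe. The attempt to remove punctuation from the words does not work: it only tests whether the last word is itself a punctuation character, and words is a list, which has no rstrip method. The code below defines a class that converts text into a list of stripped words (duplicates included). It also defines a parameterized generator method that filters the word list.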

class Words:
    def __init__(self, text):
        self.words = words = []
        for line in text.split('\n'):
            for word in line.split():
                words.append(word.strip(""",'."?!()[]{}*$#"""))
    def iter_n(self, n):
        for word in self.words:
            if len(word) == n:
                yield word

# Test
text = """
It should output a word per line.
Which is not what's it's doing!
(It outputs the words as a [list] and then continuously outputs 'a'.)
How can I fix my #*!code?
"""
words = Words(text)
for word in words.iter_n(5):
    print(word)

# Prints
Which
doing
words
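
If the goal is still a class that can be used directly in a for loop, as in the question, one possible variation is to make __iter__ a generator. This is a sketch rather than part of the answer above: CharCounter here takes the text itself instead of a file name, and the punctuation set is copied from the Words class.

```python
class CharCounter(object):
    """Iterable over the words of `text` that have exactly `strlen` characters."""

    # Characters stripped from both ends of each word (same set as in Words).
    PUNCT = """,'."?!()[]{}*$#"""

    def __init__(self, text, strlen):
        self._strlen = strlen
        # str.split() with no argument splits on any whitespace, newlines included.
        self._words = [w.strip(self.PUNCT) for w in text.split()]

    def __iter__(self):
        # A generator function: each for loop gets a fresh iterator.
        for word in self._words:
            if len(word) == self._strlen:
                yield word


for word in CharCounter("Which is not what it's doing!", 5):
    print(word)
```

Because __iter__ is a generator, no separate iterator class or index bookkeeping is needed, and the same CharCounter object can be looped over more than once.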