Iterating over the elements of several strings simultaneously

Time: 2018-07-17 13:04:57

Tags: python

I have a dictionary that contains sentences keyed by book and page:

from collections import defaultdict

# lists to build dictionary - for reproducibility
pages     = [12, 41, 50, 111, 1021, 121]
bookCodes = ['M', 'P', 'A', 'C', 'A', 'M']

sentences = ['THISISASENTANCE',
             'ANDHEREISONEMOREEXAMP',
             'ALLFROMDIFFERENTBOOKS',
             'ANDFROMDIFFERENTPAGES',
             'MOSLTYTHESAMELENGTHSS',
             'BUTSOMEWILLBABITSHORT'
             ]

# Make dictionary 
coordinates = defaultdict(dict)
for i in range(len(pages)):
    book = bookCodes[i]
    page = pages[i]
    sentence = sentences[i]
    coordinates[book][page] = sentence 

print(coordinates)

defaultdict(<type 'dict'>, {'A': {50: 'ALLFROMDIFFERENTBOOKS', 1021: 'MOSLTYTHESAMELENGTHSS'}, 'P': {41: 'ANDHEREISONEMOREEXAMP'}, 'C': {111: 'ANDFROMDIFFERENTPAGES'}, 'M': {121: 'BUTSOMEWILLBABITSHORT', 12: 'THISISASENTANCE'}})

I also have a pool of vowels stored as a dictionary, in which each vowel starts with a count of 10:

vowels = dict.fromkeys(['A', 'E', 'I', 'O', 'U'], 10)

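I want to iterate over the same element of every sentence (sentences[0][0], ..., sentences[n][0], and so on) and, each time I see a vowel (A, E, I, O or U), decrease the count of that vowel in the vowels dictionary.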

Once the pool for a vowel reaches 0, I return the letter, its position in the sentence, and the sentence itself, then break out of the loop.

from collections import defaultdict
import random

def wordStopper(sentences):
    random.shuffle(sentences)
    vowels = dict.fromkeys(['A', 'E', 'I', 'O', 'U'], 10)
    for i in range(len(sentences[1])):
        for s in sentences:
            try:
                l = s[i:i + 1]
            except IndexError:
                continue
            if l in vowels:
                vowels[l] -= 1
                print("Pos: %s, Letter: %s, Sentence: %s" % (i, l, s))
                print("As = %s, Es = %s, Is = %s, Os = %s, Us = %s" % (vowels['A'], vowels['E'], vowels['I'], vowels['O'], vowels['U']))
                if vowels[l] == 0:
                    return (l, i, s)

letter, location, sentence = wordStopper(sentences)
print("Vowel %s exhausted here %s in sentence: %s" % (letter, location, sentence))

Importantly, the sentences list should be shuffled (and I step through element 0 of every sentence in turn, then element 1, and so on), so that I am not biased toward entries that appear earlier in the sentences list.

This works as I expect, but I now want to retrieve the book and page numbers that the sentence was drawn from, which are stored in coordinates.

I can achieve this crudely by iterating over coordinates and finding the sentence that wordStopper returned. But this strikes me as a rather poor way of achieving the goal.
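The crude reverse lookup described above might look something like the following sketch. The asker's own snippet is not shown here, so the helper name `find_coordinates` is hypothetical, and the dictionary is rebuilt with `zip` only to keep the example self-contained.

```python
from collections import defaultdict

pages = [12, 41, 50, 111, 1021, 121]
bookCodes = ['M', 'P', 'A', 'C', 'A', 'M']
sentences = ['THISISASENTANCE',
             'ANDHEREISONEMOREEXAMP',
             'ALLFROMDIFFERENTBOOKS',
             'ANDFROMDIFFERENTPAGES',
             'MOSLTYTHESAMELENGTHSS',
             'BUTSOMEWILLBABITSHORT']

# Same nested mapping as in the question: book -> page -> sentence
coordinates = defaultdict(dict)
for book, page, sentence in zip(bookCodes, pages, sentences):
    coordinates[book][page] = sentence


def find_coordinates(coordinates, target):
    """Hypothetical helper: scan every book and page for a matching sentence."""
    for book, book_pages in coordinates.items():
        for page, sentence in book_pages.items():
            if sentence == target:
                return book, page
    return None  # the sentence is not stored anywhere


print(find_coordinates(coordinates, 'ANDFROMDIFFERENTPAGES'))  # ('C', 111)
```

Besides scanning the whole dictionary on every lookup, this returns only the first match, so it is ambiguous if two pages happen to hold identical sentences.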

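Normally I might iterate over the keys of coordinates before the sentences, but I can't see a way of doing that which would not bias the result toward the first key iterated.

Any suggestions are very welcome. Note: this is a toy example, so I am not looking to use any corpus parsing tools.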

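1 answer:

Answer 0 (score: 1)

I think what you need is a better data structure, one that lets you retrieve the book and page from the sentence. There are many possible designs. Here is what I would do:

First, create a data structure that holds a sentence together with its book and page: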

class SentenceWithMeta(object):
    def __init__(self, sentence):
        self.sentence = sentence
        self.book = None
        self.page = None

Then wrap all of the sentences in it. For example:

sentences_with_meta = [SentenceWithMeta(sentence) for sentence in sentences]

At this point, initialize the metadata fields of each sentence, namely book and page:

# Attach the book and page to each sentence
sentences_with_meta = [SentenceWithMeta(sentence) for sentence in sentences]
for i in range(len(pages)):
    book = bookCodes[i]
    page = pages[i]
    sentence_with_meta = sentences_with_meta[i]
    sentence_with_meta.book = book
    sentence_with_meta.page = page

Finally, in the wordStopper function, you can use the sentences_with_meta list as follows:

def wordStopper(sentences_with_meta):
    random.shuffle(sentences_with_meta)
    vowels = dict.fromkeys(['A', 'E', 'I', 'O', 'U'], 10)
    for i in range(len(sentences_with_meta[1].sentence)):
        for swm in sentences_with_meta:
            try:
                l = swm.sentence[i:i + 1]
    ...
    # the rest of the code is the same. You return swm, which has the book
    # and page already in the structure.
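Putting the pieces of this answer together, a complete version could look roughly like the sketch below. It is one possible reading rather than the answerer's exact code: the name `word_stopper`, the `pool_size` parameter, and the constructor arguments for book and page are assumptions made here, the debug prints are omitted, and the loop runs to the length of the longest sentence instead of `sentences[1]`.

```python
import random


class SentenceWithMeta(object):
    def __init__(self, sentence, book=None, page=None):
        self.sentence = sentence
        self.book = book
        self.page = page


def word_stopper(sentences_with_meta, pool_size=10):
    """Return (vowel, position, SentenceWithMeta) for the first exhausted vowel."""
    order = list(sentences_with_meta)  # shuffle a copy, leave the caller's list alone
    if not order:
        return None
    random.shuffle(order)
    vowels = dict.fromkeys('AEIOU', pool_size)
    longest = max(len(swm.sentence) for swm in order)
    for i in range(longest):            # position within the sentences
        for swm in order:               # every sentence at that position
            if i >= len(swm.sentence):  # this sentence is shorter than i
                continue
            letter = swm.sentence[i]
            if letter in vowels:
                vowels[letter] -= 1
                if vowels[letter] == 0:
                    return letter, i, swm
    return None  # no vowel pool was exhausted


pages = [12, 41, 50, 111, 1021, 121]
bookCodes = ['M', 'P', 'A', 'C', 'A', 'M']
sentences = ['THISISASENTANCE',
             'ANDHEREISONEMOREEXAMP',
             'ALLFROMDIFFERENTBOOKS',
             'ANDFROMDIFFERENTPAGES',
             'MOSLTYTHESAMELENGTHSS',
             'BUTSOMEWILLBABITSHORT']

sentences_with_meta = [SentenceWithMeta(s, b, p)
                       for s, b, p in zip(sentences, bookCodes, pages)]

result = word_stopper(sentences_with_meta)
if result is not None:
    letter, location, swm = result
    print("Vowel %s exhausted at position %s in book %s, page %s: %s"
          % (letter, location, swm.book, swm.page, swm.sentence))
```

Because the book and page travel with the sentence, no reverse lookup in coordinates is needed once the function returns.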

Side note: to get the letter at index i from a string, you don't need a slice. Just use an index:

l = swm.sentence[i]
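One difference between the two forms is worth keeping in mind. A slice past the end of a string quietly returns an empty string, so an `except IndexError` branch around a slice never runs, whereas a plain index raises `IndexError`. A small illustration (the variable names are only for this example):

```python
s = 'THISISASENTANCE'  # 15 characters, valid indices 0..14

in_range_slice = s[3:4]    # 'S'
in_range_index = s[3]      # 'S'

past_end_slice = s[20:21]  # '' (empty string, no exception)

try:
    past_end_index = s[20]
except IndexError:
    past_end_index = None  # indexing past the end raises IndexError
```

So with indexing, the try/except in the loop actually does the job of skipping sentences that are too short.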

Many other designs would work as well.
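As one example of such an alternative, the nested coordinates dictionary can be flattened into a list of (book, page, sentence) records and that list shuffled, so no key is favoured by iteration order. This is only a sketch: the `Entry` namedtuple and the `flatten` helper are names invented for illustration.

```python
import random
from collections import defaultdict, namedtuple

Entry = namedtuple('Entry', ['book', 'page', 'sentence'])

pages = [12, 41, 50, 111, 1021, 121]
bookCodes = ['M', 'P', 'A', 'C', 'A', 'M']
sentences = ['THISISASENTANCE',
             'ANDHEREISONEMOREEXAMP',
             'ALLFROMDIFFERENTBOOKS',
             'ANDFROMDIFFERENTPAGES',
             'MOSLTYTHESAMELENGTHSS',
             'BUTSOMEWILLBABITSHORT']

coordinates = defaultdict(dict)
for book, page, sentence in zip(bookCodes, pages, sentences):
    coordinates[book][page] = sentence


def flatten(coordinates):
    """Hypothetical helper: turn book -> page -> sentence into a flat list."""
    return [Entry(book, page, sentence)
            for book, book_pages in coordinates.items()
            for page, sentence in book_pages.items()]


entries = flatten(coordinates)
random.shuffle(entries)  # removes any bias from dictionary key order

for entry in entries:
    print(entry.book, entry.page, entry.sentence)
```

The shuffled `entries` list can then be walked position by position in the same way as the sentences list, reading `entry.sentence[i]` and returning the whole `entry` when a vowel runs out.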