How do I do this?

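Time: 2016-11-23 14:45:23

Tags: list python-3.x text compression

How can I make this program compress a file into a list of unique words and a list of positions from which the original file can be recreated? It should then take the compressed file and recreate the full text of the original file, including punctuation and capitalization.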

startsentence = input("Please enter a sentence: ")
sentence = (startsentence)
a = startsentence.split(" ")
dict = dict()
number = 1
positions = []
for j in a:
    if j not in dict:
        dict[j] = str(number)
        number = number + 1
    positions.append(dict[j])
print(positions)
f = open("positions.txt", "w")
f.write( str(positions) + "\n"  )
f.close()

print(sentence)
f = open("words.txt", "w") 
f.write( str(startsentence) + "\n"  ) 
f.close() 

1 answer:

Answer 0 (score: 0)

At the moment you are writing the whole startsentence, not just the words:

f = open("words.txt", "w") 
f.write( str(startsentence) + "\n"  ) 
f.close()

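You only need to write the unique words and their indices, and you have already built a dictionary, dict, that holds those words and their indices (by the way, you really shouldn't use dict as a variable name; I'll use dct). You just need to write the words sorted by their values (using a with statement to handle the file):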

with open("words.txt", "w") as f:
    f.write(' '.join(sorted(dct, key=dct.get)) + '\n')
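The positions file needs the same plain, whitespace-separated layout if it is going to be read back with split(); writing str(positions), as the question does, stores Python list syntax instead. Here is a minimal sketch of the whole writing side, assuming zero-based positions. The function name compress and its parameters are illustrative and do not come from the question:

```python
def compress(sentence, words_path="words.txt", positions_path="positions.txt"):
    """Write the unique words and the position list of `sentence` to two files."""
    dct = {}        # word -> zero-based index, in order of first appearance
    positions = []  # one index per word of the sentence
    for word in sentence.split():
        if word not in dct:
            dct[word] = len(dct)
        positions.append(dct[word])

    words = sorted(dct, key=dct.get)  # unique words ordered by their index
    with open(words_path, "w") as f:
        f.write(" ".join(words) + "\n")
    with open(positions_path, "w") as f:
        f.write(" ".join(str(p) for p in positions) + "\n")
    return words, positions
```

Because the words are split on whitespace only, punctuation and capitalization stay attached to each word, so "Cat." and "cat" are stored as two different entries.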

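Assuming you have a list of positions (by the way, starting from 0 is much easier than starting from 1) and a list of words, recovering the text is simple: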

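# Both files are expected to hold whitespace-separated items on one line.
with open('positions.txt') as pf, open('words.txt') as wf:
    positions = [int(p) for p in pf.read().split()]
    words = wf.read().strip().split()

# Use words[p - 1] instead if your positions start from 1.
recovered = ' '.join(words[p] for p in positions)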
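To check that the scheme really restores punctuation and capitalization, it can help to run a round trip in memory before involving files. This is only a sketch: the helper names compress and decompress are made up for illustration, positions are zero-based, and the text is split on single spaces so that joining with a single space gives back exactly the original string.

```python
def compress(text):
    """Return (unique tokens, zero-based positions) for `text`."""
    words = []      # unique tokens in order of first appearance
    index = {}      # token -> its position in `words`
    positions = []
    for token in text.split(" "):
        if token not in index:
            index[token] = len(words)
            words.append(token)
        positions.append(index[token])
    return words, positions


def decompress(words, positions):
    """Rebuild the text by looking up each position in the word list."""
    return " ".join(words[p] for p in positions)


original = "Ask not what your country can do for you, ask what you can do for your country!"
words, positions = compress(original)
assert decompress(words, positions) == original
```

Tokens are kept verbatim, so "Ask" and "ask" or "country" and "country!" count as different words. That costs a little compression but is what makes the exact reconstruction possible.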