Cleaning up Python output display

Time: 2018-08-05 19:34:41

Tags: python tiff

I have this code:

from sklearn.feature_extraction.text import TfidfVectorizer
titles = open("user1_titles.txt",'r')
vectorizer = TfidfVectorizer(min_df=1)
X = vectorizer.fit_transform(titles)
idf = vectorizer.idf_
print(dict(zip(vectorizer.get_feature_names(), idf)), file = open("user1_tf.csv",'a'))

But it gives me the following output:

{'00': 7.8987145343299883, '007': 9.6034626265684135, '01': 9.6034626265684135, '012': 9.197997518460248, '01273': 9.6034626265684135, '02': 9.6034626265684135, '020': 9.6034626265684135, '026514': 9.6034626265684135,... etc

The output I need is:

00 7.8987145343299883
007 9.6034626265684135
etc.    

My goal is to remove the curly braces {} from the output and keep only two columns of data: the name and the value.

3 Answers:

Answer 0: (score: 0)

Use pprint, a "data pretty printer":

from pprint import pprint

d = {'00': 7.8987145343299883, '007': 9.6034626265684135, '01': 9.6034626265684135, '012': 9.197997518460248, '01273': 9.6034626265684135, '02': 9.6034626265684135, '020': 9.6034626265684135, '026514': 9.6034626265684135}
pprint(d)

Output:

{'00': 7.898714534329988,
 '007': 9.603462626568414,
 '01': 9.603462626568414,
 '012': 9.197997518460248,
 '01273': 9.603462626568414,
 '02': 9.603462626568414,
 '020': 9.603462626568414,
 '026514': 9.603462626568414}

Or use a hand-rolled solution based on format:

for key, value in d.items():
    print( '{:>6} {}'.format(key, value) )

Result:

026514 9.603462626568414
   012 9.197997518460248
    01 9.603462626568414
    00 7.898714534329988
   020 9.603462626568414
   007 9.603462626568414
    02 9.603462626568414
 01273 9.603462626568414
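
The format approach above pads keys to a fixed width of 6 and prints to the console. A minimal sketch of the same idea that sizes the column from the longest key, sorts the keys, and writes to a stream instead. The function name write_columns and the StringIO stand-in for the file are assumptions made for illustration, not part of the original answer:

```python
import io

def write_columns(mapping, stream):
    """Write one 'key value' line per entry, keys sorted and right-aligned."""
    # Pad every key to the width of the longest one so the values line up.
    width = max((len(key) for key in mapping), default=0)
    for key in sorted(mapping):
        print('{:>{}} {}'.format(key, width, mapping[key]), file=stream)

# A few sample entries taken from the dictionary above.
d = {'00': 7.898714534329988, '007': 9.603462626568414, '012': 9.197997518460248}

buffer = io.StringIO()  # stands in for open("user1_tf.csv", "a")
write_columns(d, buffer)
print(buffer.getvalue(), end='')
```

Passing a real file object (opened in a with block so that it gets closed) in place of buffer should write the same lines to disk.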

Answer 1: (score: 0)

You can do it as follows:

# Open the file once, then write one "key value" line per entry.
with open("user1_tf.csv", 'a') as out:
    for key, value in dict(zip(vectorizer.get_feature_names(), idf)).items():
        print(key, value, file=out)

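Or you can first collect the dictionary in a variable and then print its items, like:

# Store the dict itself; print() returns None, so its result cannot be iterated.
data = dict(zip(vectorizer.get_feature_names(), idf))
with open("user1_tf.csv", 'a') as out:
    for key, value in data.items():
        print(key, value, file=out)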

Answer 2: (score: 0)

This is basically the same as Saurabh's answer, but it prints the values.

def splitPrint(data):
    for key,value in data.items():
        print(key, value)
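
Since the target file in the question is named user1_tf.csv, the standard csv module may also be an option. A rough sketch, assuming a single space as the column delimiter; the helper name write_csv_columns and the sample data are hypothetical, and a StringIO is used here in place of the real file:

```python
import csv
import io

def write_csv_columns(data, stream, delimiter=' '):
    """Write each key/value pair of `data` as one two-column row."""
    writer = csv.writer(stream, delimiter=delimiter, lineterminator='\n')
    for key, value in data.items():
        writer.writerow([key, value])

# Two sample entries in the shape produced by dict(zip(names, idf)).
sample = {'00': 7.898714534329988, '007': 9.603462626568414}

out = io.StringIO()  # a real file would be opened with newline=''
write_csv_columns(sample, out)
print(out.getvalue(), end='')
```

Using delimiter=',' instead would give a conventional comma-separated file, and the csv module would add quoting for any key that itself contains the delimiter.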