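I have a MongoDB database with a collection of about 12 million documents and an index on the Data.Report.FIELD field. I am trying to fetch all of that field's values. If I do this with one big cursor, it dies, so I split the work into chunks of 100K documents.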
    for i in range(10):
        for a in data.find({'Data.Report.FIELD': {"$gt": (i*100000), "$lt": (i+1)*100000+1}}, {'Data.Report.FIELD': 1}):
            if 'FIELD' in a['Data']['Report'][0].keys():
                _ids.append([a['_id'], a['Data']['Report'][0]['FIELD']])
                _FIELDs.append(a['Data']['Report'][0]['FIELD'])
                good += 1
            else:
                bad += 1
        print('Done with ', i, ' hundred thousand. Time: ', time.time()-start, 'seconds.')
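To make the chunking explicit, here is a minimal sketch that builds the same range filters as the loop above, without touching the database. `chunk_filters` is a hypothetical helper introduced only for illustration; it is not part of pymongo. It shows that each pass selects a window of FIELD values 100,000 wide rather than a fixed number of documents, so the number of documents per pass depends on how the values are distributed.

```python
def chunk_filters(n_chunks, width=100_000, field='Data.Report.FIELD'):
    """Build the range filters that the loop above passes to find().

    Hypothetical helper: it only constructs filter dicts and never
    talks to MongoDB.
    """
    return [
        {field: {'$gt': i * width, '$lt': (i + 1) * width + 1}}
        for i in range(n_chunks)
    ]


filters = chunk_filters(10)
first_bounds = filters[0]['Data.Report.FIELD']
last_bounds = filters[-1]['Data.Report.FIELD']

print(first_bounds)  # {'$gt': 0, '$lt': 100001}
print(last_bounds)   # {'$gt': 900000, '$lt': 1000001}
```

With `range(10)` the ten filters together only match FIELD values greater than 0 and less than 1,000,001; documents with values outside that window are never visited.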
This is what I get:
Done with 0 hundred thousand. Time: 116.90340232849121 seconds.
Done with 1 hundred thousand. Time: 182.20432806015015 seconds.
Done with 2 hundred thousand. Time: 2561.886509180069 seconds.
Done with 3 hundred thousand. Time: 4840.841073274612 seconds.
What could be the reason it becomes so insanely slow after the first 200K documents? Is there anything I can change? Could it be a server issue?
UPD: Indexes:
{'_id_': {'v': 2, 'key': [('_id', 1)], 'ns': 'admin.ECR0618'},
'FIELD': {'v': 2, 'key': [('Data.Report.FIELD', 1)], 'ns': 'admin.ECR0618',
'background': False}}
Basic stats:
'ns': 'admin.ECR0618',
'size': 176633446637.0,
'count': 11782003,
'avgObjSize': 14991,
'storageSize': 59065884672.0,
'capped': False,
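For scale, a quick back-of-the-envelope conversion of these numbers. The figures are copied from the stats above; reading `size` as the uncompressed data size and `storageSize` as the space allocated on disk is how MongoDB documents these fields, but how either compares to the server's RAM is not stated here.

```python
# Values copied from the collection stats above.
size_bytes = 176633446637.0      # 'size'
storage_bytes = 59065884672.0    # 'storageSize'
count = 11782003                 # 'count'

avg_obj = size_bytes / count             # average document size in bytes
size_gib = size_bytes / 2**30            # uncompressed data size in GiB
storage_gib = storage_bytes / 2**30      # allocated storage in GiB
ratio = size_bytes / storage_bytes       # data size relative to storage

print(f"avg document: {avg_obj:.0f} bytes")
print(f"data size:    {size_gib:.1f} GiB")
print(f"storage size: {storage_gib:.1f} GiB")
print(f"ratio:        {ratio:.2f}")
```

This works out to roughly 164.5 GiB of uncompressed data in about 55 GiB of storage, with an average document of about 15 KB, which agrees with the reported `avgObjSize`.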