MongoDB - Remove duplicate documents (GeoJSON)

Date: 2015-05-14 05:16:11

Tags: node.js mongodb mongoose geojson

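I would like to know the best way to remove duplicate documents from a large GeoJSON collection (roughly 80k rows) stored in MongoDB. I believe the duplicates are causing errors on my front end, because I am unable to log the full collection to the console.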

I tried the dropDups approach in the mongo shell, as described in the linked question "MongoDB query to remove duplicate documents from a collection", but without success. I also believe dropDups has been deprecated since MongoDB 2.6.

Here is a sample of my schema structure:

{
  "type": "FeatureCollection",
  "features": [
    {
      "geometry": {
        "type": "Point", "coordinates": [-73.994720, 40.686902]
      }
    },
    {
      "geometry": {
        "type": "Point", "coordinates": [-73.994720, 40.686902]
      }
    },
    {
      "geometry": {
        "type": "Point", "coordinates": [-73.989205, 40.686675]
      }
    },
    {
      "geometry": {
        "type": "Point", "coordinates": [-73.994655, 40.687391]
      }
    },
    {
      "geometry": {
        "type": "Point", "coordinates": [-73.985557, 40.687683]
      }
    },
    {
      "geometry": {
        "type": "Point", "coordinates": [-73.985557, 40.687683]
      }
    },
    {
      "geometry": {
        "type": "Point", "coordinates": [-73.984656, 40.685462]
      }
    }
  ]
}
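
If the features live in a single FeatureCollection document, as in the sample above, one option is to remove the repeats in application code before saving or rendering. The following is only a minimal sketch in plain JavaScript: the function name `dedupeFeatures` is hypothetical, and it assumes that two features count as duplicates when their `geometry.coordinates` arrays are identical.

```javascript
// Return a copy of a GeoJSON FeatureCollection in which each coordinate
// pair appears only once. The first feature with a given pair is kept.
function dedupeFeatures(collection) {
  const seen = new Set();
  const features = collection.features.filter(function (feature) {
    // Assumption: identical coordinates mean the feature is a duplicate.
    const key = JSON.stringify(feature.geometry.coordinates);
    if (seen.has(key)) {
      return false;
    }
    seen.add(key);
    return true;
  });
  // Copy the top-level fields and swap in the filtered features array.
  return Object.assign({}, collection, { features: features });
}

// The first three features from the sample: two identical points and one distinct.
const sample = {
  type: "FeatureCollection",
  features: [
    { geometry: { type: "Point", coordinates: [-73.994720, 40.686902] } },
    { geometry: { type: "Point", coordinates: [-73.994720, 40.686902] } },
    { geometry: { type: "Point", coordinates: [-73.989205, 40.686675] } }
  ]
};

const cleaned = dedupeFeatures(sample);
console.log(cleaned.features.length); // 2
```

The input object is left unmodified, so the original and the cleaned version can be compared before anything is written back to the database.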

Here are my attempts at creating the index in the mongo shell; the duplicates are still there!

> db.testschema.createIndex( { coordinates: 1 }, { unique: true, dropdups: true } )
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
> db.testschema.createIndex( { geometry: 1 }, { unique: true, dropdups: true } )
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 2,
"numIndexesAfter" : 3,
"ok" : 1
}
> db.testschema.ensureIndex({'testschema.features.geometry.coordinates': 1}, {unique: true, dropdups: true})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 3,
"numIndexesAfter" : 4,
"ok" : 1
}
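
Two things stand out in the attempts above: the documented option name is `dropDups` (capital D), and none of the index keys matches the path under which the sample stores its coordinates (`features.geometry.coordinates`). Since `dropDups` was removed in MongoDB 3.0, a common alternative is to find duplicates with an aggregation and delete them explicitly. The sketch below assumes something the question does not confirm: that each feature is stored as its own document with a `geometry.coordinates` field. The helper names `duplicatePipeline` and `idsToRemove` are hypothetical.

```javascript
// Build an aggregation pipeline that groups documents by one field path
// and keeps only the groups that contain more than one document.
function duplicatePipeline(fieldPath) {
  return [
    { $group: { _id: "$" + fieldPath, ids: { $push: "$_id" }, count: { $sum: 1 } } },
    { $match: { count: { $gt: 1 } } }
  ];
}

// From the grouped results, collect every _id except the first of each
// group, so that one document per distinct value survives.
function idsToRemove(groups) {
  var ids = [];
  groups.forEach(function (group) {
    ids = ids.concat(group.ids.slice(1));
  });
  return ids;
}

// Hypothetical usage in the mongo shell (not executed here; assumes one
// feature per document in the testschema collection):
//   var groups = db.testschema.aggregate(
//     duplicatePipeline("geometry.coordinates"),
//     { allowDiskUse: true }
//   ).toArray();
//   db.testschema.remove({ _id: { $in: idsToRemove(groups) } });
```

Running the aggregation alone first and inspecting `groups` is a cheap way to check how many duplicates exist before removing anything. A unique index on the same field could then be created afterwards to keep new duplicates out.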
  

0 answers:

No answers