MongoDB - poor performance when no results are returned

Time: 2015-11-23 11:43:24

Tags: mongodb mongodb-indexes

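I have a MongoDB collection containing about 7 million documents that represent places.

I run a query that searches for places near a specific location whose name starts with a given prefix.

We have a compound index, described below, to speed up the search.

When the search query finds a match (even just one), it executes very fast (~20 ms). But when there is no match, the query can take 30 seconds to execute.

Please assist.

Details:

Each place (geoData) has the following fields:

"loc" - a GeoJSON point that represents the location
"categoriesIds" - array of int ids
"name" - the name of the place

The following index is defined on this collection: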

{
  "loc" : "2dsphere",
  "categoriesIds" : 1,
  "name" : 1
}

The query is:

db.geoData.find({
  "loc":{
    "$near":{
      "$geometry":{
        "type": "Point" ,
        "coordinates": [ -0.10675191879272461 , 51.531600743186644]
      },
      "$maxDistance": 5000.0
    }
  }, 
  "categoriesIds":{
    "$in": [ 1 , 2 , 71 , 70 , 74 , 72 , 73 , 69 , 44 , 26 , 27 , 33 , 43 , 45 , 53 , 79]
  }, 
  "name":{ "$regex": "^Cafe Ne"}
})

Execution stats (link to the whole explain result):

    "executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 1,
    "executionTimeMillis" : 169,
    "totalKeysExamined" : 14333,
    "totalDocsExamined" : 1,
    "executionStages" : {
        "stage" : "GEO_NEAR_2DSPHERE",
        "nReturned" : 1,
        "executionTimeMillisEstimate" : 60,
        "works" : 14354,
        "advanced" : 1,
        "needTime" : 14351,
        "needFetch" : 0,
        "saveState" : 361,
        "restoreState" : 361,
        "isEOF" : 1,
        "invalidates" : 0,
        "keyPattern" : {
            "loc" : "2dsphere",
            "categoriesIds" : 1,
            "name" : 1
        },
        "indexName" : "loc_2dsphere_categoriesIds_1_name_1",
        "searchIntervals" : [ 
            {
                "minDistance" : 0,
                "maxDistance" : 3408.329295346151,
                "maxInclusive" : false
            }, 
            {
                "minDistance" : 3408.329295346151,
                "maxDistance" : 5000,
                "maxInclusive" : true
            }
        ],
        "inputStages" : [ 
            {
                "stage" : "FETCH",
                "nReturned" : 1,
                "executionTimeMillisEstimate" : 20,
                "works" : 6413,
                "advanced" : 1,
                "needTime" : 6411,
                "needFetch" : 0,
                "saveState" : 361,
                "restoreState" : 361,
                "isEOF" : 1,
                "invalidates" : 0,
                "docsExamined" : 1,
                "alreadyHasObj" : 0,
                "inputStage" : {
                    "stage" : "IXSCAN",
                    "filter" : {
                        "TwoDSphereKeyInRegionExpression" : true
                    },
                    "nReturned" : 1,
                    "executionTimeMillisEstimate" : 20,
                    "works" : 6413,
                    "advanced" : 1,
                    "needTime" : 6411,
                    "needFetch" : 0,
                    "saveState" : 361,
                    "restoreState" : 361,
                    "isEOF" : 1,
                    "invalidates" : 0,
                    "keyPattern" : {
                        "loc" : "2dsphere",
                        "categoriesIds" : 1,
                        "name" : 1
                    },
                    "indexName" : "loc_2dsphere_categoriesIds_1_name_1",
                    "isMultiKey" : true,
                    "direction" : "forward",
                    "indexBounds" : {
                        "loc" : [ 
                            "[\"2f1003230\", \"2f1003230\"]", 
                            "[\"2f10032300\", \"2f10032300\"]", 
                            "[\"2f100323000\", \"2f100323000\"]", 
                            "[\"2f1003230001\", \"2f1003230001\"]", 
                            "[\"2f10032300012\", \"2f10032300013\")", 
                            "[\"2f1003230002\", \"2f1003230002\"]", 
                            "[\"2f10032300021\", \"2f10032300022\")", 
                            "[\"2f10032300022\", \"2f10032300023\")", 
                            "[\"2f100323003\", \"2f100323003\"]", 
                            "[\"2f1003230031\", \"2f1003230031\"]", 
                            "[\"2f10032300311\", \"2f10032300312\")", 
                            "[\"2f10032300312\", \"2f10032300313\")", 
                            "[\"2f10032300313\", \"2f10032300314\")", 
                            "[\"2f1003230032\", \"2f1003230032\"]", 
                            "[\"2f10032300320\", \"2f10032300321\")", 
                            "[\"2f10032300321\", \"2f10032300322\")"
                        ],
                        "categoriesIds" : [ 
                            "[1.0, 1.0]", 
                            "[2.0, 2.0]", 
                            "[26.0, 26.0]", 
                            "[27.0, 27.0]", 
                            "[33.0, 33.0]", 
                            "[43.0, 43.0]", 
                            "[44.0, 44.0]", 
                            "[45.0, 45.0]", 
                            "[53.0, 53.0]", 
                            "[69.0, 69.0]", 
                            "[70.0, 70.0]", 
                            "[71.0, 71.0]", 
                            "[72.0, 72.0]", 
                            "[73.0, 73.0]", 
                            "[74.0, 74.0]", 
                            "[79.0, 79.0]"
                        ],
                        "name" : [ 
                            "[\"Cafe Ne\", \"Cafe Nf\")", 
                            "[/^Cafe Ne/, /^Cafe Ne/]"
                        ]
                    },
                    "keysExamined" : 6412,
                    "dupsTested" : 0,
                    "dupsDropped" : 0,
                    "seenInvalidated" : 0,
                    "matchTested" : 1
                }
            }, 
            {
                "stage" : "FETCH",
                "nReturned" : 0,
                "executionTimeMillisEstimate" : 40,
                "works" : 7922,
                "advanced" : 0,
                "needTime" : 7921,
                "needFetch" : 0,
                "saveState" : 261,
                "restoreState" : 261,
                "isEOF" : 1,
                "invalidates" : 0,
                "docsExamined" : 0,
                "alreadyHasObj" : 0,
                "inputStage" : {
                    "stage" : "IXSCAN",
                    "filter" : {
                        "TwoDSphereKeyInRegionExpression" : true
                    },
                    "nReturned" : 0,
                    "executionTimeMillisEstimate" : 40,
                    "works" : 7922,
                    "advanced" : 0,
                    "needTime" : 7921,
                    "needFetch" : 0,
                    "saveState" : 261,
                    "restoreState" : 261,
                    "isEOF" : 1,
                    "invalidates" : 0,
                    "keyPattern" : {
                        "loc" : "2dsphere",
                        "categoriesIds" : 1,
                        "name" : 1
                    },
                    "indexName" : "loc_2dsphere_categoriesIds_1_name_1",
                    "isMultiKey" : true,
                    "direction" : "forward",
                    "indexBounds" : {
                        "loc" : [ 
                            "[\"2f1003230\", \"2f1003230\"]", 
                            "[\"2f10032300\", \"2f10032300\"]", 
                            "[\"2f100323000\", \"2f100323000\"]", 
                            "[\"2f1003230001\", \"2f1003230001\"]", 
                            "[\"2f10032300011\", \"2f10032300012\")", 
                            "[\"2f10032300012\", \"2f10032300013\")", 
                            "[\"2f1003230002\", \"2f1003230002\"]", 
                            "[\"2f10032300021\", \"2f10032300022\")", 
                            "[\"2f10032300022\", \"2f10032300023\")", 
                            "[\"2f100323003\", \"2f100323003\"]", 
                            "[\"2f1003230031\", \"2f1003230032\")", 
                            "[\"2f1003230032\", \"2f1003230032\"]", 
                            "[\"2f10032300320\", \"2f10032300321\")", 
                            "[\"2f10032300321\", \"2f10032300322\")", 
                            "[\"2f10032300322\", \"2f10032300323\")"
                        ],
                        "categoriesIds" : [ 
                            "[1.0, 1.0]", 
                            "[2.0, 2.0]", 
                            "[26.0, 26.0]", 
                            "[27.0, 27.0]", 
                            "[33.0, 33.0]", 
                            "[43.0, 43.0]", 
                            "[44.0, 44.0]", 
                            "[45.0, 45.0]", 
                            "[53.0, 53.0]", 
                            "[69.0, 69.0]", 
                            "[70.0, 70.0]", 
                            "[71.0, 71.0]", 
                            "[72.0, 72.0]", 
                            "[73.0, 73.0]", 
                            "[74.0, 74.0]", 
                            "[79.0, 79.0]"
                        ],
                        "name" : [ 
                            "[\"Cafe Ne\", \"Cafe Nf\")", 
                            "[/^Cafe Ne/, /^Cafe Ne/]"
                        ]
                    },
                    "keysExamined" : 7921,
                    "dupsTested" : 0,
                    "dupsDropped" : 0,
                    "seenInvalidated" : 0,
                    "matchTested" : 0
                }
            }
        ]
    },

Execution stats when searching for "CafeNeeNNN" instead of "Cafe Ne" (link to the whole explain result):

 "executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 0,
    "executionTimeMillis" : 2537,
    "totalKeysExamined" : 232259,
    "totalDocsExamined" : 162658,
    "executionStages" : {
        "stage" : "FETCH",
        "filter" : {
            "$and" : [ 
                {
                    "name" : /^CafeNeeNNN/
                }, 
                {
                    "categoriesIds" : {
                        "$in" : [ 
                            1, 
                            2, 
                            26, 
                            27, 
                            33, 
                            43, 
                            44, 
                            45, 
                            53, 
                            69, 
                            70, 
                            71, 
                            72, 
                            73, 
                            74, 
                            79
                        ]
                    }
                }
            ]
        },
        "nReturned" : 0,
        "executionTimeMillisEstimate" : 1330,
        "works" : 302752,
        "advanced" : 0,
        "needTime" : 302750,
        "needFetch" : 0,
        "saveState" : 4731,
        "restoreState" : 4731,
        "isEOF" : 1,
        "invalidates" : 0,
        "docsExamined" : 70486,
        "alreadyHasObj" : 70486,
        "inputStage" : {
            "stage" : "GEO_NEAR_2DSPHERE",
            "nReturned" : 70486,
            "executionTimeMillisEstimate" : 1290,
            "works" : 302751,
            "advanced" : 70486,
            "needTime" : 232264,
            "needFetch" : 0,
            "saveState" : 4731,
            "restoreState" : 4731,
            "isEOF" : 1,
            "invalidates" : 0,
            "keyPattern" : {
                "loc" : "2dsphere"
            },
            "indexName" : "loc_2dsphere",
            "searchIntervals" : [ 
                {
                    "minDistance" : 0,
                    "maxDistance" : 3408.329295346151,
                    "maxInclusive" : false
                }, 
                {
                    "minDistance" : 3408.329295346151,
                    "maxDistance" : 5000,
                    "maxInclusive" : true
                }
            ],
            "inputStages" : [ 
                {
                    "stage" : "FETCH",
                    "nReturned" : 44540,
                    "executionTimeMillisEstimate" : 110,
                    "works" : 102690,
                    "advanced" : 44540,
                    "needTime" : 58149,
                    "needFetch" : 0,
                    "saveState" : 4731,
                    "restoreState" : 4731,
                    "isEOF" : 1,
                    "invalidates" : 0,
                    "docsExamined" : 44540,
                    "alreadyHasObj" : 0,
                    "inputStage" : {
                        "stage" : "IXSCAN",
                        "filter" : {
                            "TwoDSphereKeyInRegionExpression" : true
                        },
                        "nReturned" : 44540,
                        "executionTimeMillisEstimate" : 90,
                        "works" : 102690,
                        "advanced" : 44540,
                        "needTime" : 58149,
                        "needFetch" : 0,
                        "saveState" : 4731,
                        "restoreState" : 4731,
                        "isEOF" : 1,
                        "invalidates" : 0,
                        "keyPattern" : {
                            "loc" : "2dsphere"
                        },
                        "indexName" : "loc_2dsphere",
                        "isMultiKey" : false,
                        "direction" : "forward",
                        "indexBounds" : {
                            "loc" : [ 
                                "[\"2f1003230\", \"2f1003230\"]", 
                                "[\"2f10032300\", \"2f10032300\"]", 
                                "[\"2f100323000\", \"2f100323000\"]", 
                                "[\"2f1003230001\", \"2f1003230001\"]", 
                                "[\"2f10032300012\", \"2f10032300013\")", 
                                "[\"2f1003230002\", \"2f1003230002\"]", 
                                "[\"2f10032300021\", \"2f10032300022\")", 
                                "[\"2f10032300022\", \"2f10032300023\")", 
                                "[\"2f100323003\", \"2f100323003\"]", 
                                "[\"2f1003230031\", \"2f1003230031\"]", 
                                "[\"2f10032300311\", \"2f10032300312\")", 
                                "[\"2f10032300312\", \"2f10032300313\")", 
                                "[\"2f10032300313\", \"2f10032300314\")", 
                                "[\"2f1003230032\", \"2f1003230032\"]", 
                                "[\"2f10032300320\", \"2f10032300321\")", 
                                "[\"2f10032300321\", \"2f10032300322\")"
                            ]
                        },
                        "keysExamined" : 102689,
                        "dupsTested" : 0,
                        "dupsDropped" : 0,
                        "seenInvalidated" : 0,
                        "matchTested" : 44540
                    }
                }, 
                {
                    "stage" : "FETCH",
                    "nReturned" : 47632,
                    "executionTimeMillisEstimate" : 250,
                    "works" : 129571,
                    "advanced" : 47632,
                    "needTime" : 81938,
                    "needFetch" : 0,
                    "saveState" : 2556,
                    "restoreState" : 2556,
                    "isEOF" : 1,
                    "invalidates" : 0,
                    "docsExamined" : 47632,
                    "alreadyHasObj" : 0,
                    "inputStage" : {
                        "stage" : "IXSCAN",
                        "filter" : {
                            "TwoDSphereKeyInRegionExpression" : true
                        },
                        "nReturned" : 47632,
                        "executionTimeMillisEstimate" : 230,
                        "works" : 129571,
                        "advanced" : 47632,
                        "needTime" : 81938,
                        "needFetch" : 0,
                        "saveState" : 2556,
                        "restoreState" : 2556,
                        "isEOF" : 1,
                        "invalidates" : 0,
                        "keyPattern" : {
                            "loc" : "2dsphere"
                        },
                        "indexName" : "loc_2dsphere",
                        "isMultiKey" : false,
                        "direction" : "forward",
                        "indexBounds" : {
                            "loc" : [ 
                                "[\"2f1003230\", \"2f1003230\"]", 
                                "[\"2f10032300\", \"2f10032300\"]", 
                                "[\"2f100323000\", \"2f100323000\"]", 
                                "[\"2f1003230001\", \"2f1003230001\"]", 
                                "[\"2f10032300011\", \"2f10032300012\")", 
                                "[\"2f10032300012\", \"2f10032300013\")", 
                                "[\"2f1003230002\", \"2f1003230002\"]", 
                                "[\"2f10032300021\", \"2f10032300022\")", 
                                "[\"2f10032300022\", \"2f10032300023\")", 
                                "[\"2f100323003\", \"2f100323003\"]", 
                                "[\"2f1003230031\", \"2f1003230032\")", 
                                "[\"2f1003230032\", \"2f1003230032\"]", 
                                "[\"2f10032300320\", \"2f10032300321\")", 
                                "[\"2f10032300321\", \"2f10032300322\")", 
                                "[\"2f10032300322\", \"2f10032300323\")"
                            ]
                        },
                        "keysExamined" : 129570,
                        "dupsTested" : 0,
                        "dupsDropped" : 0,
                        "seenInvalidated" : 0,
                        "matchTested" : 47632
                    }
                }
            ]
        }
    },

Indexes on the collection:

{
"0" : {
    "v" : 1,
    "key" : {
        "_id" : 1
    },
    "name" : "_id_",
    "ns" : "wego.geoData"
},
"1" : {
    "v" : 1,
    "key" : {
        "srcId" : 1
    },
    "name" : "srcId_1",
    "ns" : "wego.geoData"
},
"2" : {
    "v" : 1,
    "key" : {
        "loc" : "2dsphere"
    },
    "name" : "loc_2dsphere",
    "ns" : "wego.geoData",
    "2dsphereIndexVersion" : 2
},
"3" : {
    "v" : 1,
    "key" : {
        "name" : 1
    },
    "name" : "name_1",
    "ns" : "wego.geoData"
},
"4" : {
    "v" : 1,
    "key" : {
        "loc" : "2dsphere",
        "categoriesIds" : 1,
        "name" : 1
    },
    "name" : "loc_2dsphere_categoriesIds_1_name_1",
    "ns" : "wego.geoData",
    "2dsphereIndexVersion" : 2
},
"5" : {
    "v" : 1,
    "key" : {
        "loc" : "2dsphere",
        "categoriesIds" : 1,
        "keywords" : 1
    },
    "name" : "loc_2dsphere_categoriesIds_1_keywords_1",
    "ns" : "wego.geoData",
    "2dsphereIndexVersion" : 2
}
}

Collection stats link

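4 answers:

Answer 0 (score: 10)

I am going to speculate a bit here, and then comment on your design.

First, when you create an index on a key whose value is an array, you create one index entry for every element of that array:

"To index a field that holds an array value, MongoDB creates an index key for each element in the array."

This is from MongoDB's own documentation about indexes.

So, if your typical record has more than a handful of categories, and you have 7 million records, your index is huge, and scanning the index itself takes time just to find out that it does not contain what you are looking for. It is still faster than a collection scan, but it is slow compared to how quickly an existing record is found.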
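To make the key-per-element behaviour concrete, here is a minimal sketch in plain JavaScript. It does not touch MongoDB: `multikeyIndexKeys` is a hypothetical helper, the sample document is made up, and the `loc` part of the compound key is left out for simplicity.

```javascript
// Hypothetical helper: lists the index keys a multikey compound index would
// hold for one document, one key per element of the array field.
function multikeyIndexKeys(doc, arrayField, otherFields) {
  return doc[arrayField].map(function (element) {
    const key = {};
    key[arrayField] = element;
    otherFields.forEach(function (field) {
      key[field] = doc[field];
    });
    return key;
  });
}

// Made-up sample document with three category ids.
const place = { name: "Cafe Ne", categoriesIds: [1, 2, 71] };
const keys = multikeyIndexKeys(place, "categoriesIds", ["name"]);

console.log(keys.length); // 3, one key per category id

// Rough size estimate, ASSUMING every one of the 7 million documents
// carries three category ids like the sample above.
const estimatedKeys = 7000000 * keys.length;
console.log(estimatedKeys); // 21000000
```

The real number of keys depends on how many category ids each document actually has, which the question does not state.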

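Now, let me comment on your schema design. This is a matter of style, so feel free to ignore this part.

Your records can belong to 17 categories. That is a bit overwhelming, and it abuses the term "category". A category is a specific division, a way to quickly associate a thing with a group of things. What kind of thing belongs to that many groups? Let's take your record Cafe Ne as an example. I assume that in the real world (and remember that programs and applications at best solve real-world problems) Cafe Ne is either a restaurant, a coffee shop, a jazz bar, or a diner. It is definitely not a garage (unless "cafe" means "car" in a language I don't know). I find it hard to imagine it also being a bank or a dental clinic. I would have to try really hard to find more than 10 meaningful categories under which a user would search for a cafe.

My point is that even though MongoDB allows you to design such a thing, it does not mean you have to. Try to narrow down the number of categories you store and the number of categories you search for, and you will get better performance.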

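Answer 1 (score: 2)

As JohnnyHK proposed in the comments, and as Oz123 pointed out in his answer, the problem here appears to be an index that has grown so large that it no longer performs as an index. I believe that, in addition to the category expansion problem already pointed out, the ordering of the fields in your index is also causing trouble. Compound indexes are built according to the order of fields, and putting name after categoriesIds makes queries on name more costly.

It is clear that you need to tune your indexes. Exactly how to tune them depends on the types of queries you expect to support. In particular, I am not sure whether you will see better results from a compound index on loc and name, or from individual indexes, one for loc and one for name. Mongo itself is a little vague on when it is better to use a compound index and when it is better to use individual indexes and rely on index intersection.

My intuition says that individual indexes will perform better, but I would test both scenarios.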
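One way to test competing indexes is to force each one with a hint and compare the explain counters. The sketch below is only that: `compareHints` is a hypothetical helper, it assumes it is given a mongo shell collection handle such as db.geoData, and it relies on the shell's find(), hint() and explain("executionStats") calls. Because a $near query needs a geospatial index, the hints worth trying are limited to the 2dsphere indexes.

```javascript
// Hypothetical helper for the mongo shell: runs the same query once per index
// hint and collects the counters that explain("executionStats") reports.
function compareHints(collection, query, hints) {
  return hints.map(function (hint) {
    const stats = collection
      .find(query)
      .hint(hint)
      .explain("executionStats").executionStats;
    return {
      hint: hint,
      nReturned: stats.nReturned,
      millis: stats.executionTimeMillis,
      keysExamined: stats.totalKeysExamined,
      docsExamined: stats.totalDocsExamined,
    };
  });
}

// Shell usage, with `query` being the find() filter from the question and
// the index names taken from the index listing in the question:
// compareHints(db.geoData, query, [
//   "loc_2dsphere_categoriesIds_1_name_1",
//   "loc_2dsphere",
// ]);
```

Running it with both a matching prefix and a non-matching one should show whether the slow case comes from the number of keys examined or from the number of documents fetched.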

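If you anticipate also needing to query by category, without the name or loc fields that could narrow the query down, it is probably better to create a separate categoriesIds index.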

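Answer 2 (score: 1)

The order of fields in a compound index is very important. It is hard to diagnose without access to the real data and usage patterns, but this key might improve the odds of matching (or failing to match) a document using the index alone: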

{
  "loc" : "2dsphere",
  "name" : 1,
  "categoriesIds" : 1
}
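A minimal sketch of how that key order could be created from the mongo shell follows. `ensureReorderedIndex` is a hypothetical helper, not something from the question; it assumes a collection handle such as db.geoData and uses the shell's getIndexes() and createIndex() calls. Whether the new order actually helps is untested here.

```javascript
// Hypothetical helper: creates the key order suggested above on a collection
// handle, unless an index with exactly the same key already exists.
function ensureReorderedIndex(collection) {
  const wanted = { loc: "2dsphere", name: 1, categoriesIds: 1 };
  const wantedJson = JSON.stringify(wanted);
  const exists = collection.getIndexes().some(function (index) {
    // Field order matters for compound indexes, so compare the ordered form.
    return JSON.stringify(index.key) === wantedJson;
  });
  if (!exists) {
    collection.createIndex(wanted);
  }
  return !exists; // true when a new index was requested
}

// Shell usage:
// ensureReorderedIndex(db.geoData);
```

On a collection of 7 million documents the build can take a while, so it is worth trying on a copy of the data first.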

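Answer 3 (score: 0)

Not sure whether this is exactly the same problem, but we saw similarly poor performance with a multikey index when no results were found.

This was actually a Mongo bug that was fixed in v3.3.8: https://jira.mongodb.org/browse/SERVER-15086

We solved the problem after upgrading Mongo and rebuilding the indexes.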