Correctly grouping documents in an aggregation pipeline in order to find a set intersection

Time: 2015-02-05 08:38:51

Tags: mongodb

Say I have these two documents:

{  
   "_id":"sampleA",
   "value":{  
      "data":[  
         {  
            "thing":"A"
         },
         {  
            "thing":"B"
         },
         {  
            "thing":"C"
         },
         {  
            "thing":"D"
         },
         {  
            "thing":"E"
         }
      ]
   }
}

 {  
   "_id":"sampleB",
   "value":{  
      "data":[  
         {  
            "thing":"C"
         },
         {  
            "thing":"D"
         },
         {  
            "thing":"E"
         },
         {  
            "thing":"F"
         }
      ]
   }
}

I would like to group them into a single document that keeps the "sampleA" and "sampleB" labels, for example:

{
  "_id": null,
  "sampleA": [
    {
      "thing": "A"
    },
    {
      "thing": "B"
    },
    {
      "thing": "C"
    },
    {
      "thing": "D"
    },
    {
      "thing": "E"
    }
  ],
  "sampleB": [
    {
      "thing": "C"
    },
    {
      "thing": "D"
    },
    {
      "thing": "E"
    },
    {
      "thing": "F"
    }
  ]
}

so that I can use the set intersection operator. How can I do this? I tried:

db.testz.aggregate(
      [{
        $match: {
          _id: {
            $in: ["sampleA", "sampleB"]
          }
        }
      }, {
        '$group': {
          _id: null,
          a: {
            $push: "$value"
          }
        }
      }]
    );

which gave me:

{
  "_id": null,
  "a": [
    {
      "data": [
        {
          "thing": "A"
        },
        {
          "thing": "B"
        },
        {
          "thing": "C"
        },
        {
          "thing": "D"
        },
        {
          "thing": "E"
        }
      ]
    },
    {
      "data": [
        {
          "thing": "C"
        },
        {
          "thing": "D"
        },
        {
          "thing": "E"
        },
        {
          "thing": "F"
        }
      ]
    }
  ]
}

If I could index into the items in "a", I could perhaps use the set intersection operator:

    db.testz.aggregate(
      [{
        $match: {
          _id: {
            $in: ["sampleA", "sampleB"]
          }
        }
      }, {
        '$group': {
          _id: null,
          a: {
            $push: "$value"
          }
        }
      }, {
        '$project': {
          int: {
            $setIntersection: ["$a.0", "$a.1"]
          }
        }
      }]
    );

^^ Obviously the last step here doesn't work, but I'm trying to illustrate the point.

1 answer:

Answer 0 (score: 0)

I think the only way to do this at the moment (MongoDB 2.6) is to unwind the arrays and then re-collect the elements in a set:

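db.testz.aggregate([
    { "$match" : { "_id" : { "$in" : ["sampleA", "sampleB"] } } },
    // one output document per array element
    { "$unwind" : "$value.data" },
    // group equal elements and record which source documents contain them
    { "$group" : { "_id" : "$value.data", "sources" : { "$addToSet" : "$_id" } } },
    // keep the elements found in both documents and collect them into one array
    { "$match" : { "sources" : { "$size" : 2 } } },
    { "$group" : { "_id" : 0, "intersection" : { "$push" : "$_id" } } }
])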
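
One way to get an intersection out of an unwind-and-regroup approach is to group by element and keep only the elements that were seen in every source document. Here is a rough in-memory sketch of that idea in plain JavaScript. Nothing in it talks to MongoDB, the helper name intersectByUnwind is made up for illustration, and it assumes that equal elements serialize to the same JSON string (same key order):

```javascript
// In-memory illustration only; nothing here talks to MongoDB.
// The sample documents mirror sampleA and sampleB from the question.
const docs = [
  { _id: "sampleA", value: { data: ["A", "B", "C", "D", "E"].map(t => ({ thing: t })) } },
  { _id: "sampleB", value: { data: ["C", "D", "E", "F"].map(t => ({ thing: t })) } },
];

// Hypothetical helper: unwind every data array, group by element, and keep
// the elements that occur in every source document.
function intersectByUnwind(documents) {
  const sources = new Map(); // serialized element -> Set of source _ids
  for (const doc of documents) {
    for (const element of doc.value.data) {   // the "unwind" step
      const key = JSON.stringify(element);    // assumes a stable key order
      if (!sources.has(key)) sources.set(key, new Set());
      sources.get(key).add(doc._id);          // the "group by element" step
    }
  }
  return [...sources.entries()]
    .filter(([, ids]) => ids.size === documents.length) // seen in every document
    .map(([key]) => JSON.parse(key));
}

const intersection = intersectByUnwind(docs);
console.log(JSON.stringify(intersection));
```

For the two sample documents this leaves the elements with "thing" equal to C, D and E.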

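This is not an efficient way to do it, but it gets the job done. I'm probing you for more specific information to see whether there is a way to avoid this answer :(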
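
On later server versions, the indexing asked about in the question becomes possible. As far as I know, MongoDB 3.2 added the $arrayElemAt expression, so a pipeline along the following lines should work there. Treat it as a sketch under that assumption: it will not run on 2.6, and the variable name pipeline is only for illustration.

```javascript
// Sketch assuming MongoDB 3.2 or later, where $arrayElemAt is available.
// "pipeline" is just an illustrative variable name.
const pipeline = [
  { $match: { _id: { $in: ["sampleA", "sampleB"] } } },
  // collect each document's data array, giving an array of two arrays
  { $group: { _id: null, a: { $push: "$value.data" } } },
  // pick the two arrays out by position and intersect them
  {
    $project: {
      int: {
        $setIntersection: [
          { $arrayElemAt: ["$a", 0] },
          { $arrayElemAt: ["$a", 1] },
        ],
      },
    },
  },
];

// In the mongo shell: db.testz.aggregate(pipeline)
```

Pushing "$value.data" instead of "$value" keeps the grouped field a plain array of arrays, so each $arrayElemAt call returns one data array directly.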