Aggregation queries in mongo and spring-data-mongo

Date: 2016-12-16 15:52:15

Tags: mongodb mongodb-query spring-data aggregation-framework spring-data-mongodb

Hi everyone, I'm having a lot of trouble querying my data. I have documents like this:

{
    "_id" : NumberLong(999789748357864),
    "text" : "#asd #weila #asd2 welcome in my house",
    "date" : ISODate("2016-12-13T21:44:37.000Z"),
    "dateString" : "2016-12-13",
    "hashtags" : [ 
        "asd", 
        "weila", 
        "asd2"
    ]
}

I want to build two queries:

1) Count the hashtags for each day, producing output such as:

{_id:"2016-12-13",
hashtags:[
{hashtag:"asd",count:20},
{hashtag:"weila",count:18},
{hashtag:"asd2",count:10},
....
]
}

{_id:"2016-12-14",
hashtags:[
{hashtag:"asd",count:18},
{hashtag:"asd2",count:14},
{hashtag:"weila",count:10},
....
]
}

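2) The other is the same, but restricted to the period from 2016-12-13 to 2016-12-17.

For the first one I wrote the query below and it returns what I'm looking for, but I don't know how to write it in Spring Data Mongo.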

db.comment.aggregate([
{$unwind:"$hashtags"},
{"$group":{
    "_id":{ 
        "date" : "$dateString",
        "hashtag": "$hashtags"
    },
    "count":{"$sum":1}
    }
},
{"$group":{
    "_id": "$_id.date",
    "hashtags": { 
       "$push": { 
       "hashtag": "$_id.hashtag",
       "count": "$count"
     }},
     "count": { "$sum": "$count" }
}},
{"$sort": { count: -1}},
{"$unwind": "$hashtags"},
{"$sort": { "count": -1, "hashtags.count": -1}},
{"$group": {
        "_id": "$_id",
        "hashtags": { "$push": "$hashtags" },
        "count": { "$first": "$count" }
    }},
{$project:{name:1,hashtags: { $slice: ["$hashtags", 2 ]}}}
]);

1 answer:

Answer 0 (score: 0)

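You can still use a subset of the same aggregation, dropping the pipeline steps that come after the second $group stage. For the filtering part, however, you have to introduce a date range query in an initial $match pipeline step.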

The following mongo shell examples show how to filter the aggregation to a specific date range:

1) For the period from 2016-12-13 to 2016-12-14:

var startDate = new Date("2016-12-13");
startDate.setHours(0,0,0,0);

var endDate = new Date("2016-12-14");
endDate.setHours(23,59,59,999);
var pipeline = [
    { 
        "$match": {
            "date": { "$gte": startDate, "$lte": endDate }
        }
    },
    { "$unwind": "$hashtags" },
    {
        "$group": {
            "_id": {
                "date": "$dateString",
                "hashtag": "$hashtags"
            },
            "count": { "$sum": 1 }
        }
    },
    {
        "$group": {
            "_id": "$_id.date",
            "hashtags": { 
                "$push": { 
                    "hashtag": "$_id.hashtag",
                    "count": "$count"
                }
            }
        }
    }
]
db.comment.aggregate(pipeline)
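
One detail worth checking in the range above: setHours works in the local time zone of the machine running the shell, while a date-only string such as "2016-12-13" is parsed as midnight UTC. On a machine that is not set to UTC, the boundaries can therefore shift by a few hours. A minimal sketch of a UTC-based alternative follows, assuming the stored date values should be compared against UTC day boundaries; utcDayRange is a hypothetical helper name, not a mongo shell function.

```javascript
// Hypothetical helper: build an inclusive range covering whole UTC days.
function utcDayRange(firstDay, lastDay) {
    // A date-only ISO string such as "2016-12-13" is parsed as midnight UTC.
    var start = new Date(firstDay);
    start.setUTCHours(0, 0, 0, 0);

    // Move the end boundary to the last millisecond of that UTC day.
    var end = new Date(lastDay);
    end.setUTCHours(23, 59, 59, 999);

    return { start: start, end: end };
}

var range = utcDayRange("2016-12-13", "2016-12-14");

// The $match stage is the same as in the pipeline above, only the bounds differ.
var matchStage = {
    "$match": { "date": { "$gte": range.start, "$lte": range.end } }
};
```

If the local time zone is the intended one, the original setHours version is the right choice instead.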

2) For the period from 2016-12-13 to 2016-12-17:

var startDate = new Date("2016-12-13");
startDate.setHours(0,0,0,0);

var endDate = new Date("2016-12-17");
endDate.setHours(23,59,59,999);
// run the same pipeline as above but with the date range query set as required
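
Since only the date bounds change between the two cases, the pipeline can be wrapped in a function that takes the range as parameters. This is a rough sketch of that idea; buildHashtagPipeline is a hypothetical name chosen for illustration, and the stages are copied from the pipeline above.

```javascript
// Hypothetical helper: returns the aggregation pipeline for a given date range.
function buildHashtagPipeline(startDate, endDate) {
    return [
        // Keep only documents whose date falls inside the inclusive range.
        { "$match": { "date": { "$gte": startDate, "$lte": endDate } } },
        // Emit one document per hashtag.
        { "$unwind": "$hashtags" },
        // Count occurrences of each (day, hashtag) pair.
        {
            "$group": {
                "_id": { "date": "$dateString", "hashtag": "$hashtags" },
                "count": { "$sum": 1 }
            }
        },
        // Collect the per-hashtag counts into one document per day.
        {
            "$group": {
                "_id": "$_id.date",
                "hashtags": { "$push": { "hashtag": "$_id.hashtag", "count": "$count" } }
            }
        }
    ];
}

var start = new Date("2016-12-13T00:00:00.000Z");
var end = new Date("2016-12-17T23:59:59.999Z");
var pipeline = buildHashtagPipeline(start, end);
// In the mongo shell this would be run with: db.comment.aggregate(pipeline)
```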

Spring Data equivalent (untested):

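import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;

import java.util.List;
import com.mongodb.BasicDBObject;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.aggregation.AggregationResults;
import org.springframework.data.mongodb.core.query.Criteria;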

Aggregation agg = newAggregation(
    match(Criteria.where("date").gte(startDate).lte(endDate)),
    unwind("hashtags"),
    group("dateString", "hashtags").count().as("count"),
    group("_id.dateString")
        .push(new BasicDBObject
            ("hashtag", "$_id.hashtags").append
            ("count", "$count")
        ).as("hashtags") 
);
AggregationResults<Comment> results = mongoTemplate.aggregate(agg, Comment.class); 
List<Comment> comments = results.getMappedResults();
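
To see what each result document looks like before deciding which class to map it to, the $unwind and the two $group stages can be emulated in plain JavaScript over an in-memory array. This is only an illustration of the output shape, not how MongoDB executes the pipeline; the sample values are made up and countHashtagsPerDay is a hypothetical name.

```javascript
// Sample documents shaped like those in the question (values are made up).
var docs = [
    { dateString: "2016-12-13", hashtags: ["asd", "weila", "asd2"] },
    { dateString: "2016-12-13", hashtags: ["asd"] },
    { dateString: "2016-12-14", hashtags: ["weila"] }
];

// Hypothetical helper emulating $unwind followed by the two $group stages.
function countHashtagsPerDay(documents) {
    var counts = {};
    // $unwind + first $group: count each (dateString, hashtag) pair.
    documents.forEach(function (doc) {
        doc.hashtags.forEach(function (tag) {
            counts[doc.dateString] = counts[doc.dateString] || {};
            counts[doc.dateString][tag] = (counts[doc.dateString][tag] || 0) + 1;
        });
    });
    // Second $group: one result per day with the pushed {hashtag, count} entries.
    return Object.keys(counts).map(function (date) {
        return {
            _id: date,
            hashtags: Object.keys(counts[date]).map(function (tag) {
                return { hashtag: tag, count: counts[date][tag] };
            })
        };
    });
}

var results = countHashtagsPerDay(docs);
```

Each result has a string _id (the day) and a hashtags array of {hashtag, count} objects, which differs from the original document shape, so a dedicated result class may map more cleanly than Comment.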