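MongoDB Aggregate: how to pair related records for processing

Time: 2019-10-24 06:08:38

Tags: mongodb aggregate

I capture event data in a MongoDB database, and some of these events occur in pairs.

For example, DOOR_OPEN and DOOR_CLOSE are two events that occur as a pair.

Events collection: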

{ _id: 1, name: "DOOR_OPEN", userID: "user1", timestamp: t }
{ _id: 2, name: "DOOR_OPEN", userID: "user2", timestamp: t+5 }
{ _id: 3, name: "DOOR_CLOSE", userID: "user1", timestamp:t+10 }
{ _id: 4, name: "DOOR_OPEN", userID: "user1", timestamp:t+30 }
{ _id: 5, name: "SOME_OTHER_EVENT", userID: "user3", timestamp:t+35 }
{ _id: 6, name: "DOOR_CLOSE", userID: "user2", timestamp:t+40 }
...

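Assuming the records are sorted by timestamp, _id: 1 and _id: 3 form a "pair" for user1, and _id: 2 and _id: 6 form a pair for user2.

I want to take all of these DOOR_OPEN and DOOR_CLOSE pairs for each user and compute statistics such as the average duration for which each user kept the door open.

Can this be done with the aggregation framework?

2 Answers:

Answer 0: (score: 2)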

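You can use $lookup and $group to achieve this. Each DOOR_OPEN event is joined to the later DOOR_CLOSE events of the same user, and the resulting time differences are averaged per user: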

db.getCollection('TestColl').aggregate([
{ $match: {"name": { $in: [ "DOOR_OPEN", "DOOR_CLOSE" ] } }},
{ $lookup:
       {
         from: "TestColl",
         let: { userID_lu: "$userID", name_lu: "$name", timestamp_lu :"$timestamp" },
         pipeline: [
              { $match:
                 { $expr:
                    { $and:
                       [
                         { $eq: [ "$userID",  "$$userID_lu" ] },
                         { $eq: [ "$$name_lu", "DOOR_OPEN" ]},
                         { $eq: [ "$name", "DOOR_CLOSE" ]},
                         { $gt: [ "$timestamp", "$$timestamp_lu" ] }
                       ]
                    }
                 }
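              },
              // Keep only the earliest DOOR_CLOSE after this DOOR_OPEN
              { $sort: { timestamp: 1 } },
              { $limit: 1 }
           ],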
         as: "close_dates"
       }
},
{ $addFields: { "close_time": { $arrayElemAt: [ "$close_dates.timestamp", 0 ] }  } },
{ $addFields: { "time_diff": { $divide: [ { $subtract: [ "$close_time", "$timestamp" ] }, 1000 * 60 ]} } }, // Minutes
{ $group: { _id: "$userID" , 
    events: { $push: { "eventId": "$_id", "name": "$name",  "timestamp": "$timestamp" } },
    averageTimestamp: {$avg: "$time_diff"}
    }
}
])

Sample data:

[
{ _id: 1, name: "DOOR_OPEN", userID: "user1", timestamp: ISODate("2019-10-24T08:00:00Z") },
{ _id: 2, name: "DOOR_OPEN", userID: "user2", timestamp: ISODate("2019-10-24T08:05:00Z") },
{ _id: 3, name: "DOOR_CLOSE", userID: "user1", timestamp:ISODate("2019-10-24T08:10:00Z") },
{ _id: 4, name: "DOOR_OPEN", userID: "user1", timestamp:ISODate("2019-10-24T08:30:00Z") },
{ _id: 5, name: "SOME_OTHER_EVENT", userID: "user3", timestamp:ISODate("2019-10-24T08:35:00Z") },
{ _id: 6, name: "DOOR_CLOSE", userID: "user2", timestamp:ISODate("2019-10-24T08:40:00Z") },
{ _id: 7, name: "DOOR_CLOSE", userID: "user1", timestamp:ISODate("2019-10-24T08:50:00Z") },
{ _id: 8, name: "DOOR_OPEN", userID: "user2", timestamp:ISODate("2019-10-24T08:55:00Z") }
]
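
As a cross-check, the per-user averages for the eight sample events above can be reproduced in plain JavaScript. This is only a sketch that mirrors the join conditions of the $lookup stage (same user, DOOR_CLOSE, later timestamp, earliest match wins); the function name averageOpenMinutes is hypothetical and not part of the original answer.

```javascript
// Sample events copied from the list above, with timestamps as JavaScript Dates.
const events = [
  { _id: 1, name: "DOOR_OPEN", userID: "user1", timestamp: new Date("2019-10-24T08:00:00Z") },
  { _id: 2, name: "DOOR_OPEN", userID: "user2", timestamp: new Date("2019-10-24T08:05:00Z") },
  { _id: 3, name: "DOOR_CLOSE", userID: "user1", timestamp: new Date("2019-10-24T08:10:00Z") },
  { _id: 4, name: "DOOR_OPEN", userID: "user1", timestamp: new Date("2019-10-24T08:30:00Z") },
  { _id: 5, name: "SOME_OTHER_EVENT", userID: "user3", timestamp: new Date("2019-10-24T08:35:00Z") },
  { _id: 6, name: "DOOR_CLOSE", userID: "user2", timestamp: new Date("2019-10-24T08:40:00Z") },
  { _id: 7, name: "DOOR_CLOSE", userID: "user1", timestamp: new Date("2019-10-24T08:50:00Z") },
  { _id: 8, name: "DOOR_OPEN", userID: "user2", timestamp: new Date("2019-10-24T08:55:00Z") }
];

// Hypothetical helper: for each DOOR_OPEN, find the earliest later DOOR_CLOSE
// by the same user, then average the open durations (in minutes) per user.
function averageOpenMinutes(allEvents) {
  const totals = new Map(); // userID -> { sum, count }
  for (const open of allEvents) {
    if (open.name !== "DOOR_OPEN") continue;
    const closes = allEvents
      .filter(e => e.name === "DOOR_CLOSE" &&
                   e.userID === open.userID &&
                   e.timestamp > open.timestamp)
      .sort((a, b) => a.timestamp - b.timestamp);
    if (closes.length === 0) continue; // unmatched DOOR_OPEN contributes nothing
    const minutes = (closes[0].timestamp - open.timestamp) / (1000 * 60);
    const t = totals.get(open.userID) || { sum: 0, count: 0 };
    t.sum += minutes;
    t.count += 1;
    totals.set(open.userID, t);
  }
  const result = {};
  for (const [userID, { sum, count }] of totals) {
    result[userID] = sum / count;
  }
  return result;
}

console.log(averageOpenMinutes(events)); // { user1: 15, user2: 35 }
```

user1 has two pairs (10 and 20 minutes), giving 15; user2 has one pair (35 minutes) plus an unmatched DOOR_OPEN at 08:55, which is skipped. Note that "earliest later close" is not the same as strict sequential pairing: if a user logs two DOOR_OPEN events in a row, both would be matched to the same DOOR_CLOSE.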

Answer 1: (score: 1)

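You can use the aggregation framework's $group operator to group by userID and average the timestamp field (note that this averages the raw timestamp values; it does not pair DOOR_OPEN with DOOR_CLOSE events or compute durations):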

db.events.aggregate([{
    $group: {
        _id: "$userID",
        averageTimestamp: {$avg: "$timestamp"}
    }
}]);

If you also want to discard any events other than DOOR_OPEN or DOOR_CLOSE, you can add a $match filter stage to the aggregation pipeline:

db.events.aggregate([{
    $match: {
        $or: [{name: "DOOR_OPEN"},{name: "DOOR_CLOSE"}]
    }
}, {
    $group: {
        _id: "$userID",
        averageTimestamp: {$avg: "$timestamp"}
    }
}]);