MongoDB aggregation: group by similar strings

Time: 2018-09-27 08:51:11

Tags: mongodb mongodb-query aggregation-framework

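I have started learning MongoDB's aggregation framework, but in my project I found many brands with very similar names, such as "BrandA" and "BrandA tech". Can I group them together at the end of the aggregation?

I have 2 collections in my database:

The first one is for brands: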

{
  _id: ObjectId(),
  name: String
}

The second one is for products:

{
  _id: ObjectId(),
  name: String,
  brand: ObjectId() // referring to _id of brands
}

Now let's say I have the following brands:

{_id: ObjectId('5a9fd2b8045b020013de2a47'), name: 'brand1'},
{_id: ObjectId('5a9fcf94d28420245451a39c'), name: 'brand2'},
{_id: ObjectId('5a9fcf94d28420245451a39a'), name: 'brand1 sub1'},
{_id: ObjectId('5a9fe8bf045b020013de2a6d'), name: 'sub2 brand2'}

And the following products:

{_id: ObjectId(''), name: 'item1', brand: ObjectId('5a9fd2b8045b020013de2a47')},
{_id: ObjectId(''), name: 'item2', brand: ObjectId('5a9fcf94d28420245451a39c')},
{_id: ObjectId(''), name: 'item3', brand: ObjectId('5a9fd2b8045b020013de2a47')},
{_id: ObjectId(''), name: 'item4', brand: ObjectId('5a9fcf94d28420245451a39a')},
{_id: ObjectId(''), name: 'item5', brand: ObjectId('5a9fe8bf045b020013de2a6d')},
{_id: ObjectId(''), name: 'item6', brand: ObjectId('5a9fd2b8045b020013de2a47')},
{_id: ObjectId(''), name: 'item7', brand: ObjectId('5a9fcf94d28420245451a39c')},
{_id: ObjectId(''), name: 'item8', brand: ObjectId('5a9fcf94d28420245451a39a')}

The query I have now:

db.getCollection('products').aggregate([
  // Count products per brand id
  {$group: {
    _id: '$brand',
    amount: {$sum: 1}
  }},
  {$sort: {amount: -1}},
  // Attach the matching brand document to read its name
  {$lookup: {
    from: 'brands',
    localField: '_id',
    foreignField: '_id',
    as: 'lookup'
  }},
  {$unwind: {path: '$lookup'}},
  {$project: {
    _id: '$_id',
    brandName: '$lookup.name',
    amount: '$amount'
  }}
]);

Result:

{_id: ObjectId('5a9fd2b8045b020013de2a47'), brandName: 'brand1', amount: 3}
{_id: ObjectId('5a9fcf94d28420245451a39c'), brandName: 'brand2', amount: 2}
{_id: ObjectId('5a9fcf94d28420245451a39a'), brandName: 'brand1 sub1', amount: 2}
{_id: ObjectId('5a9fe8bf045b020013de2a6d'), brandName: 'sub2 brand2', amount: 1}

The result I want:

{_id: ObjectId(null), brandName: 'brand1', amount: 5},
{_id: ObjectId(null), brandName: 'brand2', amount: 3}

Is it possible to group the results I currently get by finding similar strings in brandName? For example, grouping "brand1" with "brand1 sub1", or "brand2" with "sub2 brand2"?

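2 answers:

Answer 0: (score: 0)

I think you can use $split and $unwind to do what you want.

$split turns your string into an array of words, and $unwind creates one document for each word in that array.

Then you can apply the pipeline you already have to count the occurrences.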
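
One way this suggestion could be sketched is below. It is an illustration rather than a tested solution: the name `pipeline`, the stage order, the intermediate field `words`, and the choice of a single space as the separator are all assumptions. It counts products per brand id first, looks up the brand name, splits that name into words, and then sums the counts per word.

```javascript
// Sketch only: stage order and the `words` field name are assumptions.
const pipeline = [
  // Count products per brand id first, so the lookup runs once per brand.
  { $group: { _id: '$brand', amount: { $sum: 1 } } },
  { $lookup: { from: 'brands', localField: '_id', foreignField: '_id', as: 'lookup' } },
  { $unwind: '$lookup' },
  // 'brand1 sub1' becomes ['brand1', 'sub1'] (assumes words are separated by one space).
  { $project: { amount: 1, words: { $split: ['$lookup.name', ' '] } } },
  // One document per word, each carrying the product count of its brand.
  { $unwind: '$words' },
  // Sum the per-brand counts for every distinct word.
  { $group: { _id: '$words', amount: { $sum: '$amount' } } },
  { $sort: { amount: -1 } },
  { $project: { _id: 0, brandName: '$_id', amount: 1 } }
];

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== 'undefined') {
  db.getCollection('products').aggregate(pipeline).forEach(printjson);
}
```

On the sample data in the question, this should give brand1 an amount of 5 and brand2 an amount of 3, but every other word gets its own group as well (sub1 with 2 and sub2 with 1), so an extra filtering step would be needed to keep only the main brand names.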

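Answer 1: (score: 0)

Changing the model makes this easy to achieve. Just add the products to the brand document. You can then get the count immediately from the length of the array, and the query is faster.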
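
A minimal sketch of what this remodelling might look like follows. The embedded `products` array, the `exampleBrand` document, and the name `countPipeline` are hypothetical and not part of the original schema. With the products stored inside each brand, the count comes from `$size` without any `$group` or `$lookup`.

```javascript
// Hypothetical reshaped brand document: products embedded as an array.
const exampleBrand = {
  name: 'brand1',
  products: [{ name: 'item1' }, { name: 'item3' }, { name: 'item6' }]
};

const countPipeline = [
  {
    $project: {
      _id: 0,
      brandName: '$name',
      // $ifNull guards against brands that have no products array yet.
      amount: { $size: { $ifNull: ['$products', []] } }
    }
  },
  { $sort: { amount: -1 } }
];

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== 'undefined') {
  db.getCollection('brands').aggregate(countPipeline).forEach(printjson);
}
```

Note that this only makes the per-brand count cheaper. Brands with similar names, such as 'brand1' and 'brand1 sub1', would still be separate documents unless they are merged when the data is remodelled. Embedded arrays are also bounded by MongoDB's 16 MB document size limit.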