我如何按_id和国家分组?

时间:2012-07-26 01:01:33

标签: ruby-on-rails ruby ruby-on-rails-3 mongodb aggregation-framework

我需要按_id和国家分组。我已成功按_id进行分组,但想知道如何对这些_id中的国家/地区进行分组,并返回每个国家/地区的计数。

我正在使用聚合框架。到目前为止一切都很好。

conn = Mongo::Connection.new
db   = conn['foobar_development']

cmd = {
  aggregate: 'live_daily_stats',
  pipeline: [
    { '$project' => {
      :metacontent => 1,
      :visits => 1,
    } },
    { '$unwind' => '$visits' },
    { '$match' => { 'visits.minute' => { '$gt' => 224 } } },
    { '$sort' => { 'visits.minute' => 1 } },
    { '$group' => { 
      :_id => '$_id', 
      :visits => { '$push' => '$visits' }, 
      :visits_count => { '$sum' => 1 },
      :metacontent => { '$addToSet' => '$metacontent' },
      } 
    },
    { '$sort' => { 'visits_count' => -1 } },
  ]
}

res = db.command(cmd)['result']

以下内容返回:

[
    [0] {
                 "_id" => "20120726/foobar/song/custom-cred",
              "visits" => [
            [0] {
                                              "country_name" => "UK",
                               "iso_two_letter_country_code" => "UK",
                                                   "referer" => "http://localhost:3000/",
                                                    "minute" => 59,
                                                  "token_id" => "134326199711wfryhpdq"
            },
            [1] {
                                              "country_name" => "UK",
                               "iso_two_letter_country_code" => "UK",
                                                   "referer" => "http://localhost:3000/",
                                                    "minute" => 59,
                                                  "token_id" => "134326199711wfryhpdq"
            },
            [2] {
                                              "country_name" => "US",
                               "iso_two_letter_country_code" => "US",
                                                   "referer" => "http://localhost:3000/",
                                                    "minute" => 59,
                                                  "token_id" => "134326199711wfryhpdq"
            }
        ],
        "visits_count" => 1,
         "metacontent" => [
            [0] {
                                     "date" => "20120726"
            }
        ]
    },
    [1] {
                 "_id" => "20120725/foobar/song/test-pg3-long-title-here-test-lorem-ipsum-dolor-lo",
              "visits" => [
            [0] {
                                              "country_name" => "UK",
                               "iso_two_letter_country_code" => "UK",
                                                   "referer" => "http://localhost:3000/",
                                                    "minute" => 58,
                                                  "token_id" => "13432600883knjzcbic"
            }
        ],
        "visits_count" => 1,
         "metacontent" => [
            [0] {
                                     "date" => "20120725"
            }
        ]
    }
]

2 个答案:

答案 0 :(得分:0)

来自the documentation

  

$ group将文档组合在一起以进行计算   基于文档集合的聚合值。几乎,   组通常支持每个页面的平均页面查看等任务   每天都有一个网站。

     

$ group的输出取决于您如何定义组。从头开始   为您所在的组指定标识符(即_id字段)   使用此管道创建。您可以从中指定单个字段   管道中的文档,先前计算的值或   聚合键由几个传入的字段组成。

     

每个组表达式都必须指定_id字段。你可以指定   _id字段为虚线字段路径引用,包含大括号(即{和})的多个字段的文档,或常量值。

我首先尝试按_id和country进行分组(让你做你想要的计数),然后按_id对结果进行分组,得到你想要的结构。

更新:

我在想这样的事情......但是我没有设置env设置来检查..

    conn = Mongo::Connection.new
    db   = conn['foobar_development']

    cmd = {
      aggregate: 'live_daily_stats',
      pipeline: [
        { '$project' => {
          :metacontent => 1,
          :visits => 1,
        } },
        { '$unwind' => '$visits' },
        { '$match' => { 'visits.minute' => { '$gt' => 224 } } },
        { '$sort' => { 'visits.minute' => 1 } },
        { '$group' => { 
          :_id => {'$_id','$visits.iso_two_letter_country_code'},
          :page_id => '$_id',
          :visits_count => { '$sum' => 1 },
   .... whatever you want ...
          :metacontent => { '$addToSet' => '$metacontent' },
          } 
        },
        { '$group' => { 
          :_id => '$page_id', 
   .... whatever you want ...
          } 
        },
        { '$sort' => { 'visits_count' => -1 } },
      ]
    }

    res = db.command(cmd)['result']

答案 1 :(得分:0)

我更改了$group以连接_idcountry_name

cmd = {
  aggregate: 'live_daily_stats',
  pipeline: [
    { '$project' => {
      :metacontent => 1,
      :visits => 1,
    } },
    { '$unwind' => '$visits' },
    { '$match' => { 'visits.minute' => { '$gt' => 224 } } },
    { '$sort' => { 'visits.minute' => 1 } },
    { '$group' => { 
      :_id => { '$add' => ['$_id', '$visits.country_name']}, 
      :visits => { '$push' => '$visits' }, 
      :visits_count => { '$sum' => 1 },
      :metacontent => { '$addToSet' => '$metacontent' },
      } 
    },
    { '$sort' => { 'visits_count' => -1 } },
  ]
}